Wiley Series in Discrete Mathematics and Optimization 


Introduction to 
Combinatorics 


SECOND EDITION 


MARTIN J. ERICKSON 


WILEY 


Contents 


Cover 


Half Title page 
Title page 
Copyright page 


Dedication 


Preface 


Chapter 1: Basic Counting Methods 
1.1 The multiplication principle 


1.2 Permutations 

1.3 Combinations 

1.4 Binomial coefficient identities 

1.5 Distributions 

1.6 The principle of inclusion and exclusion 
1.7 Fibonacci numbers 

1.8 Linear recurrence relations 


1.9 Special recurrence relations 


1.10 Counting and number theory 
Notes 


Chapter 2: Generating Functions 
2.1 Rational generating functions 


References 


Index 


Introduction to Combinatorics 


WILEY SERIES IN DISCRETE 
MATHEMATICS AND OPTIMIZATION 


AARTS AND KORST • Simulated Annealing and Boltzmann Machines: A Stochastic Approach to Combinatorial Optimization and Neural Computing
AARTS AND LENSTRA • Local Search in Combinatorial Optimization
ALON AND SPENCER • The Probabilistic Method, Third Edition
ANDERSON AND NASH • Linear Programming in Infinite-Dimensional Spaces: Theory and Application
ARLINGHAUS, ARLINGHAUS, AND HARARY • Graph Theory and Geography: An Interactive View E-Book
AZENCOTT • Simulated Annealing: Parallelization Techniques
BARTHELEMY AND GUENOCHE • Trees and Proximity Representations
BAZARRA, JARVIS, AND SHERALI • Linear Programming and Network Flows
BRUEN AND FORCINITO • Cryptography, Information Theory, and Error-Correction: A Handbook for the 21st Century
CHANDRU AND HOOKER • Optimization Methods for Logical Inference
CHONG AND ZAK • An Introduction to Optimization, Fourth Edition
COFFMAN AND LUEKER • Probabilistic Analysis of Packing and Partitioning Algorithms
COOK, CUNNINGHAM, PULLEYBLANK, AND SCHRIJVER • Combinatorial Optimization
DASKIN • Network and Discrete Location: Modes, Algorithms and Applications
DINITZ AND STINSON • Contemporary Design Theory: A Collection of Surveys
DU AND KO • Theory of Computational Complexity
ERICKSON • Introduction to Combinatorics, Second Edition
GLOVER, KLINGHAM, AND PHILLIPS • Network Models in Optimization and Their Practical Problems
GOLSHTEIN AND TRETYAKOV • Modified Lagrangians and Monotone Maps in Optimization
GONDRAN AND MINOUX • Graphs and Algorithms (Translated by S. Vajda)
GRAHAM, ROTHSCHILD, AND SPENCER • Ramsey Theory, Second Edition
GROSS AND TUCKER • Topological Graph Theory
HALL • Combinatorial Theory, Second Edition
HOOKER • Logic-Based Methods for Optimization: Combining Optimization and Constraint Satisfaction
IMRICH AND KLAVZAR • Product Graphs: Structure and Recognition
JANSON, LUCZAK, AND RUCINSKI • Random Graphs
JENSEN AND TOFT • Graph Coloring Problems
KAPLAN • Maxima and Minima with Applications: Practical Optimization and Duality
LAWLER, LENSTRA, RINNOOY KAN, AND SHMOYS, Editors • The Traveling Salesman Problem: A Guided Tour of Combinatorial Optimization
LAYWINE AND MULLEN • Discrete Mathematics Using Latin Squares
LEVITIN • Perturbation Theory in Mathematical Programming Applications
MAHMOUD • Evolution of Random Search Trees
MAHMOUD • Sorting: A Distribution Theory
MARTELLI • Introduction to Discrete Dynamical Systems and Chaos
MARTELLO AND TOTH • Knapsack Problems: Algorithms and Computer Implementations
McALOON AND TRETKOFF • Optimization and Computational Logic
MERRIS • Combinatorics, Second Edition
MERRIS • Graph Theory
MINC • Nonnegative Matrices
MINOUX • Mathematical Programming: Theory and Algorithms (Translated by S. Vajda)
MIRCHANDANI AND FRANCIS, Editors • Discrete Location Theory
NEMHAUSER AND WOLSEY • Integer and Combinatorial Optimization
NEMIROVSKY AND YUDIN • Problem Complexity and Method Efficiency in Optimization (Translated by E. R. Dawson)
PACH AND AGARWAL • Combinatorial Geometry
PLESS • Introduction to the Theory of Error-Correcting Codes, Third Edition
ROOS AND VIAL • Ph. Theory and Algorithms for Linear Optimization: An Interior Point Approach
SCHEINERMAN AND ULLMAN • Fractional Graph Theory: A Rational Approach to the Theory of Graphs
SCHIFF • Cellular Automata: A Discrete View of the World
SCHRIJVER • Theory of Linear and Integer Programming
SPALL • Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control
STIEBITZ, SCHEIDE, TOFT, AND FAVRHOLDT • Graph Edge Coloring: Vizing’s Theorem and Goldberg’s Conjecture
SZPANKOWSKI • Average Case Analysis of Algorithms on Sequences
TOMESCU • Problems in Combinatorics and Graph Theory (Translated by R. A. Melter)
TUCKER • Applied Combinatorics, Second Edition
WOLSEY • Integer Programming
YE • Interior Point Algorithms: Theory and Analysis


Introduction to 
Combinatorics 


Second Edition 


Martin J. Erickson 
Department of Mathematics 
Truman State University 
Kirksville, MO 


WILEY 


Copyright © 2013 by John Wiley & Sons, Inc. All rights reserved. 


Published by John Wiley & Sons, Inc., Hoboken, New Jersey. 
Published simultaneously in Canada. 


No part of this publication may be reproduced, stored in a retrieval system or 
transmitted in any form or by any means, electronic, mechanical, photocopying, 
recording, scanning or otherwise, except as permitted under Section 107 or 108 
of the 1976 United States Copyright Act, without either the prior written 
permission of the Publisher, or authorization through payment of the appropriate 
per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, 
Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at 
www.copyright.com. Requests to the Publisher for permission should be 
addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River 
Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at 


http://www.wiley.com/go/permission. 


Limit of Liability/Disclaimer of Warranty: While the publisher and author have 
used their best efforts in preparing this book, they make no representation or 
warranties with respect to the accuracy or completeness of the contents of this 
book and specifically disclaim any implied warranties of merchantability or 
fitness for a particular purpose. No warranty may be created or extended by sales 
representatives or written sales materials. The advice and strategies contained 
herein may not be suitable for your situation. You should consult with a 
professional where appropriate. Neither the publisher nor author shall be liable 
for any loss of profit or any other commercial damages, including but not limited 
to special, incidental, consequential, or other damages. 


For general information on our other products and services please contact our 
Customer Care Department within the United States at (800) 762-2974, outside 
the United States at (317) 572-3993 or fax (317) 572-4002. 

Wiley also publishes its books in a variety of electronic formats. Some content 
that appears in print, however, may not be available in electronic formats. For 
more information about Wiley products, visit our web site at www.wiley.com. 
Library of Congress Cataloging-in-Publication Data is now available. 


ISBN 978-1-118-63753-1 


To my parents, Robert and Lorene 


PREFACE 


This book is an update and revision of my earlier textbook of the same title. The 
most important change is an increase in the number of worked examples and 
solved exercises. Also, several new topics have been introduced. But the overall 
plan of the book is the same as in the first edition: to introduce the reader to the 
basic elements of combinatorics, along with many examples and exercises. 

Combinatorics may be described as the study of how discrete structures can be 
counted, arranged, and constructed. Accordingly, this book is an introduction to 
the three main branches of combinatorics: enumeration, existence, and 
construction. There are two chapters devoted to each of these three areas. 

Combinatorics plays a central role in mathematics. One has only to look at the 
numerous journal titles in combinatorics and discrete mathematics to see that 
this area is huge! Some of the journal titles are Journal of Combinatorial Theory 
Series A and Series B; Journal of Graph Theory; Discrete Mathematics; 
Discrete Applied Mathematics; Annals of Discrete Mathematics; Annals of 
Combinatorics; Topics in Discrete Mathematics; SIAM Journal on Discrete 
Mathematics; Graphs and Combinatorics; Combinatorica; Ars Combinatoria; 
European Journal of Combinatorics A and B; Journal of Algebraic 
Combinatorics; Journal of Combinatorial Designs; Designs, Codes, and 
Cryptography; Journal of Combinatorial Mathematics and Combinatorial 
Computing; Combinatorics, Probability & Computing; Journal of 
Combinatorics, Information & System Sciences; Algorithms and Combinatorics; 
Random Structures & Algorithms; Bulletin of the Institute of Combinatorics and 
Its Applications; Journal of Integer Sequences; Geombinatorics; Online Journal 
of Analytic Combinatorics; and The Electronic Journal of Combinatorics. These 
journal titles indicate the connections between discrete mathematics and 
computing, information theory and codes, and probability. Indeed, it is now 
desirable for all mathematicians, statisticians, and computer scientists to be 
acquainted with the basic principles of discrete mathematics. 

The format of this book is designed to gradually and systematically introduce 
the main concepts of combinatorics. In this way, the reader is brought step-by- 
step from first principles to major accomplishments, always pausing to note 
mathematical points of interest along the way. I have made it a point to discuss 
some topics that don’t receive much treatment in other books on combinatorics, 
such as Alcuin’s sequence, Rook walks, and Leech’s lattice. In order to illustrate 
the applicability of combinatorial methods, I have paid careful attention to the 
selection of exercises at the end of each section. The reader should definitely 
attempt the exercises, as a good deal of the subject is revealed there. The 
problems range in difficulty from very easy to very challenging. Solutions to 
selected exercises are provided in the back of the book. 

I wish to thank the people who have kindly made suggestions concerning this 
book: Mansur Boase, Robert Cacioppo, Duane DeTemple, Shalom Eliahou, 
Robert Dobrow, Suren Fernando, Joe Hemmeter, Daniel Jordan, Elizabeth 
Oliver, Ken Price, Adrienne Stanley, and Khang Tran. 

I also gratefully acknowledge the Wiley staff for their assistance in publishing 
this book: Liz Belmont, Kellsee Chu, Sari Friedman, Danielle LaCourciere, 
Jacqueline Palmieri, Susanne Steitz-Filler, and Stephen Quigley. 


CHAPTER 1 


BASIC COUNTING METHODS 


We begin our tour of combinatorics by investigating elementary methods for 
counting finite sets. How many ways are there to choose a subset of a set? How 
many permutations of a set are there? We will explore these and other such 
questions. 


1.1 The multiplication principle 


We start with the simplest counting problems. Many of these problems are 
concerned with the number of ways in which certain choices can occur. 

Here is a useful counting principle: If one choice can be made in x ways and 
another choice in y ways, and the two choices are independent, then the two 
choices together can be made in xy ways. This rule is called the “multiplication 
principle.” 


EXAMPLE 1.1 


Suppose that you have three hats and four scarves. How many different hat 
and scarf outfits can you choose? 
Solution: By the multiplication principle, there are 3 · 4 = 12 different outfits. 
Let’s call the hats $h_1$, $h_2$, and $h_3$ and the scarves $s_1$, $s_2$, $s_3$, and $s_4$. Then we can 
list the different outfits as follows: 

$$\begin{array}{llll}
h_1,s_1 & h_1,s_2 & h_1,s_3 & h_1,s_4 \\
h_2,s_1 & h_2,s_2 & h_2,s_3 & h_2,s_4 \\
h_3,s_1 & h_3,s_2 & h_3,s_3 & h_3,s_4
\end{array}$$
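
The multiplication principle can also be checked by brute-force enumeration. The following is a minimal Python sketch (the labels `h1`, ..., `s4` are just illustrative strings standing for the hats and scarves of Example 1.1); it forms every pair of independent choices and counts them.

```python
from itertools import product

hats = ["h1", "h2", "h3"]
scarves = ["s1", "s2", "s3", "s4"]

# Each outfit pairs one independent choice of hat with one choice of scarf.
outfits = list(product(hats, scarves))

for hat, scarf in outfits:
    print(hat, scarf)

# The number of pairs equals the product of the numbers of choices.
print(len(outfits) == len(hats) * len(scarves))
```

The same call extends to three or more independent choices by passing more lists to `product`.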


EXAMPLE 1.2 


At the French restaurant Chacun à Son Goût, there are three choices for the 
appetizer, four choices for the entrée, and five choices for the dessert. How 
many different dinner orders (consisting of appetizer, entrée, and dessert) 
can we make? 

Solution: The answer is 3 · 4 · 5 = 60, and it isn’t difficult to list all the 
possibilities. Let’s call the appetizers $a_1$, $a_2$, and $a_3$, the entrées $e_1$, $e_2$, $e_3$, and 
$e_4$, and the desserts $d_1$, $d_2$, $d_3$, $d_4$, and $d_5$. Then the different possible dinners 
are as follows: 

$$\begin{array}{lllll}
a_1,e_1,d_1 & a_1,e_1,d_2 & a_1,e_1,d_3 & a_1,e_1,d_4 & a_1,e_1,d_5 \\
a_1,e_2,d_1 & a_1,e_2,d_2 & a_1,e_2,d_3 & a_1,e_2,d_4 & a_1,e_2,d_5 \\
a_1,e_3,d_1 & a_1,e_3,d_2 & a_1,e_3,d_3 & a_1,e_3,d_4 & a_1,e_3,d_5 \\
a_1,e_4,d_1 & a_1,e_4,d_2 & a_1,e_4,d_3 & a_1,e_4,d_4 & a_1,e_4,d_5 \\
a_2,e_1,d_1 & a_2,e_1,d_2 & a_2,e_1,d_3 & a_2,e_1,d_4 & a_2,e_1,d_5 \\
a_2,e_2,d_1 & a_2,e_2,d_2 & a_2,e_2,d_3 & a_2,e_2,d_4 & a_2,e_2,d_5 \\
a_2,e_3,d_1 & a_2,e_3,d_2 & a_2,e_3,d_3 & a_2,e_3,d_4 & a_2,e_3,d_5 \\
a_2,e_4,d_1 & a_2,e_4,d_2 & a_2,e_4,d_3 & a_2,e_4,d_4 & a_2,e_4,d_5 \\
a_3,e_1,d_1 & a_3,e_1,d_2 & a_3,e_1,d_3 & a_3,e_1,d_4 & a_3,e_1,d_5 \\
a_3,e_2,d_1 & a_3,e_2,d_2 & a_3,e_2,d_3 & a_3,e_2,d_4 & a_3,e_2,d_5 \\
a_3,e_3,d_1 & a_3,e_3,d_2 & a_3,e_3,d_3 & a_3,e_3,d_4 & a_3,e_3,d_5 \\
a_3,e_4,d_1 & a_3,e_4,d_2 & a_3,e_4,d_3 & a_3,e_4,d_4 & a_3,e_4,d_5
\end{array}$$


EXAMPLE 1.3 


A variable name in a certain computer programming language consists of a 
letter (A through Z), a letter followed by another letter, or a letter followed 
by a digit (0 through 9). How many different variable names are possible? 


Solution: There are 26 variable names consisting of a single letter, $26^2$ variable 
names consisting of two letters, and 26 · 10 variable names consisting of a letter 
followed by a digit. Altogether, there are $26 + 26^2 + 26 \cdot 10 = 962$ 
variable names. 
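
As a cross-check of Example 1.3, here is a small Python sketch that builds the three families of names explicitly and counts them. The list names (`single`, `two_letters`, `letter_digit`) are chosen here for illustration only.

```python
from string import ascii_uppercase, digits

# Three disjoint cases: one letter, two letters, or a letter followed by a digit.
single = list(ascii_uppercase)
two_letters = [a + b for a in ascii_uppercase for b in ascii_uppercase]
letter_digit = [a + d for a in ascii_uppercase for d in digits]

names = single + two_letters + letter_digit

# Disjoint cases are added; within each case the choices are multiplied.
print(len(single), len(two_letters), len(letter_digit), len(names))
```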


EXAMPLE 1.4 Number of binary strings 


How many binary strings of length n are there? 
Solution: There are two choices (0 or 1) for each element in the string. Hence, 
there are $2^n$ possible strings. 

For instance, there are $2^3 = 8$ binary strings of length 3: 
000, 001, 010, 011, 100, 101, 110, 111. 


EXAMPLE 1.5 Number of subsets of a set 


Let S be a set of n elements. How many subsets does S have? 
Solution: There are two choices for each element of S; it can be in the subset or 
not in the subset. This means that there are $2^n$ subsets altogether. 

For instance, let S = {a, b, c}, so that n = 3. Then S has $2^3 = 8$ subsets: 
∅, {a}, {b}, {c}, {a,b}, {b,c}, {a,c}, {a,b,c}. 
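
Examples 1.4 and 1.5 count the same thing in two guises: a binary string of length n records, position by position, whether each element is in the subset. The sketch below (the function name `subsets` is an illustrative choice) makes that correspondence explicit.

```python
from itertools import product

def subsets(elements):
    """Return all subsets of `elements`, one for each binary string of length n."""
    result = []
    for bits in product("01", repeat=len(elements)):
        # A '1' in position i means that elements[i] belongs to the subset.
        result.append({x for x, bit in zip(elements, bits) if bit == "1"})
    return result

all_subsets = subsets(["a", "b", "c"])
for s in all_subsets:
    print(sorted(s))
print(len(all_subsets))
```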


EXERCISES 


1.1 A person making a book display wants to showcase a novel, a history 
book, and a travel guide. There are four choices for the novel, two choices 
for the history book, and 10 choices for the travel guide. How many choices 
are possible for the three books? 

1.2 A license consists of three digits (0 through 9), followed by a letter (A 
through Z), followed by another digit. How many different licenses are 
there? 

1.3 How many strings of length 10 are there in which the symbols may be 0, 
1, or 2? 

1.4 How many subsets of the set {a, b, c, d, e, f, g, h, i, j} do not contain 
both a and b? 

1.5 How many binary strings of length 99 have an odd number of 1's? 

1.6 How many functions map the set {a, b, c} to the set {w, x, y, z}? 

1.7 Let X be an n-element set. How many functions from X to X are there? 
1.8 Let X = {1, 2, 3,...,2n}. How many functions from X to X are there such 
that each even number is mapped to an even number and each odd number is 
mapped to an odd number? 


1.2 Permutations 


One of the fundamental concepts of counting is that of a permutation. A 
permutation of a set is an ordering of the elements of the set. 


EXAMPLE 1.6 


List the permutations of the set {a, b, c}. 
Solution: There are six permutations: 

abc, acb, bac, bca, cab, cba. 

We set 

(1.1) $n! = 1 \cdot 2 \cdot 3 \cdots n, \quad n \ge 1; \qquad 0! = 1.$ 

The expression n! is called n factorial. 

We see in the above example that the number of permutations is 6 = 3!. 

There are n! permutations of an n-element set. The reason is there are n 
choices for the first element in the permutation. Once that choice is made, there 
are n − 1 choices for the second element, and then n − 2 choices for the third 
element, and so on. Altogether, there are $n(n-1)(n-2) \cdots 3 \cdot 2 \cdot 1$ 
choices, which is n! 
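
The count n! can be confirmed by listing permutations directly. This is a minimal Python sketch using the standard library; the variable `perms` is an illustrative name.

```python
from itertools import permutations
from math import factorial

# All orderings of the set {a, b, c}, written as strings.
perms = ["".join(p) for p in permutations("abc")]
print(perms)

# The number of orderings of an n-element set agrees with n!.
for n in range(7):
    assert len(list(permutations(range(n)))) == factorial(n)
```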


EXAMPLE 1.7 


In how many ways can the letters of the word MISSISSIPPI be arranged? 
Solution: This is an example of a permutation of a set with repeated elements. 
There are 11! permutations of the 11 letters of MISSISSIPPI, but there is much 
duplication. 

We need to divide by the number of permutations of the four I’s, the four S’s, 
the two P’s, and the one M. Thus, the number of different arrangements of the 
letters is 

$$\frac{11!}{4!\,4!\,2!\,1!} = 34{,}650.$$ 
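
The division argument of Example 1.7 can be written as a short Python sketch. The function name `arrangements` is an assumption made for illustration; it divides the factorial of the word length by the factorial of each letter's multiplicity.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def arrangements(word):
    """Number of distinct arrangements of the letters of `word`."""
    total = factorial(len(word))
    for multiplicity in Counter(word).values():
        total //= factorial(multiplicity)  # remove orderings of identical letters
    return total

print(arrangements("MISSISSIPPI"))

# Brute-force check on a short word, where full enumeration is cheap.
assert arrangements("NOON") == len(set(permutations("NOON")))
```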


Let S be an n-element set, where n ≥ 0. How many permutations of k elements of 
S are there, where 1 ≤ k ≤ n? There are n choices for the first element, n − 1 
choices for the second element, ..., n − k + 1 choices for the kth element. Hence, 
there are 

$$n(n-1) \cdots (n-k+1)$$ 

choices altogether. This expression, denoted P(n, k), may be written as 

(1.2) $P(n, k) = \dfrac{n!}{(n-k)!}, \quad 0 \le k \le n.$ 

(Notice that we now allow k = 0, which gives P(n, 0) = 1.) We can interpret this 
formula as a MISSISSIPPI-type problem. The selected elements may be denoted 
$x_1, \ldots, x_k$, and the nonselected elements all denoted with the letter N (for 
nonselected). Arranging these n symbols in a row, with the n − k copies of N 
indistinguishable, gives n!/(n − k)! arrangements. 


EXAMPLE 1.8 
An organization has 100 members. How many ways may they select a 
president, a vice-president, a secretary, and a treasurer? 


Solution: The number of ways to select a permutation of four people from a 
group of 100 is P(100, 4) = 100 · 99 · 98 · 97 = 94,109,400. 
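
Formula (1.2) translates directly into code. The sketch below is one possible implementation (the function name `P` mirrors the text's notation); it multiplies the k falling factors and compares the result with the factorial quotient and with Python's built-in `math.perm`.

```python
from math import factorial, perm

def P(n, k):
    """Number of permutations of k elements chosen from an n-element set."""
    result = 1
    for i in range(k):
        result *= n - i  # n choices, then n - 1, ..., then n - k + 1
    return result

# Selecting four distinct officers, in order, from 100 members.
print(P(100, 4))

# Agreement with n!/(n - k)! and with the standard library.
for n in range(8):
    for k in range(n + 1):
        assert P(n, k) == factorial(n) // factorial(n - k) == perm(n, k)
```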


EXERCISES 


1.9 A teacher has eight books to put on a shelf. How many different 
orderings of the books are possible? 

1.10 You have three small glasses, four medium-size glasses, and five large 
glasses. If glasses of the same size are indistinguishable, how many ways 
can you arrange the glasses in a row? 

1.11 A couple plans to visit three selected cities in Germany, followed by 
four selected cities in France, followed by five selected cities in Spain. In 
how many ways can the couple order their itinerary? 

1.12 A student has 10 books but only room for six of them on a shelf. How 
many permutations of the books are possible on the shelf? 

1.13 A librarian wants to arrange four astronomy books, five medical books, 
and six religious books on a shelf. Books of the same category should be 
grouped together, but otherwise the books may be put in any order. How 
many orderings are possible? 

1.14 In how many ways can you arrange the letters of the word 
RHODODENDRON? 

1.15 How many one-to-one functions are there from the set {a, b, c} to the 
set {t, u, v, w, x, y, z}? 

1.16 Let X be an n-element set. How many functions from X to X are not 
one-to-one? 

1.17 Find a formula for the number of different binary relations possible on 
a set of n elements. 


1.3 Combinations 


Another fundamental concept of counting is that of a combination. A 
combination from a set is an unordered subset (of a given size) of the set. 

For convenience, we sometimes refer to an n-element set as an n-set and a k- 
element subset as a k-subset. Also, we use the notation $\mathbf{N} = \{1, 2, 3, \ldots\}$ and 
$\mathbf{N}_m = \{1, 2, 3, \ldots, m\}$. 

Let S be an n-set, where n ≥ 0. How many k-subsets of S are there, where 0 ≤ k 
≤ n? We can regard this as a MISSISSIPPI-type problem, i.e., a problem of 
permutations with repeated elements. Let X denote selected elements and N 
denote nonselected elements. Then the number of combinations is the number of 
arrangements of k X’s and n − k N’s, since each such arrangement specifies a 
combination. 

Hence, the number of combinations, denoted C(n, k), is given by 

(1.3) $C(n, k) = \dfrac{n!}{k!\,(n-k)!}, \quad 0 \le k \le n.$ 

We call this expression “n choose k.” We set C(n, k) = 0 for k < 0 and k > n. 

For example, with n = 5 and k = 3, we have C(5, 3) = 5!/(3!2!) = 10 
combinations of three elements from the set S = {a, b, c, d, e}, as shown below 
with the corresponding arrangements of X’s and N’s: 

{a,b,c}  {a,b,d}  {a,b,e}  {a,c,d}  {a,c,e} 
XXXNN    XXNXN    XXNNX    XNXXN    XNXNX 
{a,d,e}  {b,c,d}  {b,c,e}  {b,d,e}  {c,d,e} 
XNNXX    NXXXN    NXXNX    NXNXX    NNXXX. 
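
The correspondence between 3-subsets of {a, b, c, d, e} and arrangements of three X’s and two N’s can be generated mechanically. The following Python sketch is illustrative only; the list name `rows` is an assumption.

```python
from itertools import combinations
from math import comb

S = "abcde"
rows = []
for subset in combinations(S, 3):
    # Mark each element of S as selected (X) or nonselected (N).
    pattern = "".join("X" if x in subset else "N" for x in S)
    rows.append(("".join(subset), pattern))

for subset, pattern in rows:
    print(subset, pattern)

# Each pattern arises exactly once, and the count matches C(5, 3).
print(len(rows) == comb(5, 3))
```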

The values of C(n, k) are given by a famous array of numbers known as 
Pascal’s triangle. See Figure 1.1. The triangle is created by starting with a 1 in 
the top row, placing 1's at the ends of each successive row, and adding two 
consecutive entries in a row to produce the entry beneath and between these 
entries. Thus, we can generate Pascal’s triangle from the initial values 

(1.4) $C(n, 0) = 1$ and $C(n, n) = 1$ for all $n \ge 0$ 

and the relation 

(1.5) $C(n, k) = C(n-1, k-1) + C(n-1, k), \quad 1 \le k \le n-1.$ 

                      1
                    1  1
                   1  2  1
                 1  3  3  1
                1  4  6  4  1
             1  5  10  10  5  1
           1  6  15  20  15  6  1
         1  7  21  35  35  21  7  1
       1  8  28  56  70  56  28  8  1
    1  9  36  84  126  126  84  36  9  1
1  10  45  120  210  252  210  120  45  10  1

Figure 1.1 Pascal’s triangle. 

The rows of Pascal’s triangle are numbered 0, 1, 2, etc. (from top to bottom), and 
the columns are numbered 0, 1, 2, etc. (from left to right). The entry in row n, 
column k of Pascal’s triangle is C(n, k). 
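
The initial values (1.4) and the relation (1.5) are enough to build the triangle row by row, without computing any factorials. Here is a minimal Python sketch of that construction; the function name `pascal_rows` is an illustrative choice.

```python
from math import comb

def pascal_rows(n_max):
    """Return rows 0..n_max of Pascal's triangle using only (1.4) and (1.5)."""
    rows = [[1]]
    for n in range(1, n_max + 1):
        prev = rows[-1]
        # Ends are 1; each interior entry is the sum of the two entries above it.
        row = [1] + [prev[k - 1] + prev[k] for k in range(1, n)] + [1]
        rows.append(row)
    return rows

rows = pascal_rows(10)
for row in rows:
    print(" ".join(str(x) for x in row).center(50))

# The entry in row n, column k agrees with n!/(k!(n-k)!).
assert all(rows[n][k] == comb(n, k) for n in range(11) for k in range(n + 1))
```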


EXAMPLE 1.9 


Evaluate C(10, 5). 
Solution: We see that the 5th entry of the 10th row of Pascal’s triangle is 252. 
Hence C(10, 5) = 252. This means that there are 252 combinations of five 
objects from a set of 10 objects. 
Pascal’s triangle gives the coefficients of the expansion of a binomial, such as a 
+ b, raised to a power. For example, $(a + b)^3 = a^3 + 3a^2b + 3ab^2 + b^3$, 
and we see that the coefficients 1, 3, 3, 1 constitute the third row of Pascal’s 
triangle. For this reason, the entries of Pascal’s triangle are called binomial 
coefficients. 


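We often use the notation $\binom{n}{k}$ for C(n, k). We may also write this binomial 
coefficient as $\binom{n}{k,\,n-k}$. Thus, we know 

(1.6) $\binom{n}{0} = 1$ and $\binom{n}{n} = 1$ for all $n \ge 0$ 

and 

(1.7) $\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}, \quad 1 \le k \le n-1.$ 

We also have the formula 

(1.8) $\binom{n}{k} = \dfrac{n!}{k!\,(n-k)!}, \quad 0 \le k \le n.$ 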


Binomial theorem. For any numbers a and b (real or complex) and any 
nonnegative integer n, we have 

$$(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k.$$ 

Proof. We give a combinatorial proof showing that, for each k, where 0 ≤ k ≤ n, 
the coefficients of $a^{n-k}b^k$ on the two sides of the equation are equal. The 
coefficient on the left side is the number of ways of selecting n − k 
factors a + b that contribute a's to the expansion of $(a + b)^n$ (the other k factors 
contribute b's). Such selections amount to choices of k unordered objects from a set of n 
objects; hence there are $\binom{n}{k}$ of them. This number is the coefficient of $a^{n-k}b^k$ 
on the right side. 

We could also have proved the binomial theorem by mathematical induction 
using (1.6) and (1.7). 

EXAMPLE 1.10 


What is the coefficient of $a^{12}b^{8}$ in the expansion of $(a + b)^{20}$? 

Solution: By the binomial theorem, the coefficient is 

$$\binom{20}{8} = \frac{20!}{8!\,12!} = 125{,}970.$$ 
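
The binomial theorem can be spot-checked numerically. The sketch below (with the illustrative function name `binomial_expand`) sums the terms on the right side of the theorem for integer inputs and compares the total with the power on the left side; `math.comb` supplies the binomial coefficients.

```python
from math import comb

def binomial_expand(a, b, n):
    """Right side of the binomial theorem: sum of C(n, k) a^(n-k) b^k."""
    return sum(comb(n, k) * a ** (n - k) * b ** k for k in range(n + 1))

# Compare both sides for a few integer values of a, b, and n.
for a in range(-3, 4):
    for b in range(-3, 4):
        for n in range(8):
            assert binomial_expand(a, b, n) == (a + b) ** n

# Coefficient of a^12 b^8 in (a + b)^20.
print(comb(20, 8))
```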
How can we find the expansion of a multinomial expression raised to a power, 
such as $(a + b + c)^{10}$? The answer is given by the multinomial theorem. 

From the solution to the MISSISSIPPI problem, we know that the number of 
ways that n objects can be divided into groups of sizes $k_1, k_2, \ldots, k_m$, such that 
$k_1 + k_2 + \cdots + k_m = n$, where order among and within groups is unimportant, is 

$$\frac{n!}{k_1!\,k_2! \cdots k_m!}.$$ 

This expression, called a multinomial coefficient, is denoted by 

$$\binom{n}{k_1, k_2, \ldots, k_m}.$$ 

Multinomial theorem. In the expansion of $(x_1 + x_2 + \cdots + x_m)^n$, the 
coefficient of $x_1^{k_1} x_2^{k_2} \cdots x_m^{k_m}$, where the $k_i$ are nonnegative integers 
such that $k_1 + k_2 + \cdots + k_m = n$, is the multinomial coefficient 

$$\binom{n}{k_1, k_2, \ldots, k_m}.$$ 


EXAMPLE 1.11 


Give the expansion of $(a + b + c)^3$. 
Solution: By the multinomial theorem, 

$$\begin{aligned}
(a + b + c)^3 &= \binom{3}{3,0,0} a^3 + \binom{3}{2,1,0} a^2 b + \binom{3}{1,2,0} a b^2 + \binom{3}{0,3,0} b^3 \\
&\quad + \binom{3}{2,0,1} a^2 c + \binom{3}{1,1,1} abc + \binom{3}{0,2,1} b^2 c \\
&\quad + \binom{3}{1,0,2} a c^2 + \binom{3}{0,1,2} b c^2 + \binom{3}{0,0,3} c^3 \\
&= a^3 + 3a^2b + 3ab^2 + b^3 + 3a^2c + 6abc + 3b^2c + 3ac^2 + 3bc^2 + c^3.
\end{aligned}$$ 

We can also think of a multinomial coefficient as an “ordered partition.” For 
instance, the multinomial coefficient $\binom{23}{7,10,6}$ is the number of partitions of the set 
{1, 2, 3,...,23} into three subsets, A, B, and C, where A has seven elements, B 
has ten elements, and C has six elements. 
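
A multinomial coefficient is easy to compute from its definition. The sketch below uses illustrative function names (`multinomial`, `trinomial_coefficients`) and tabulates the coefficients of $(a + b + c)^n$ indexed by the exponent triple.

```python
from math import factorial

def multinomial(n, ks):
    """n! / (k_1! k_2! ... k_m!), where the k_i must sum to n."""
    if sum(ks) != n or any(k < 0 for k in ks):
        raise ValueError("the k_i must be nonnegative and sum to n")
    result = factorial(n)
    for k in ks:
        result //= factorial(k)
    return result

def trinomial_coefficients(n):
    """Map (i, j, l) with i + j + l = n to the coefficient of a^i b^j c^l."""
    return {
        (i, j, n - i - j): multinomial(n, (i, j, n - i - j))
        for i in range(n + 1)
        for j in range(n + 1 - i)
    }

coeffs = trinomial_coefficients(3)
for exponents, c in sorted(coeffs.items(), reverse=True):
    print(exponents, c)

# Setting a = b = c = 1 shows that the coefficients sum to 3^n.
print(sum(coeffs.values()) == 3 ** 3)
```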


EXERCISES 


1.18 A student decides to take three classes from a set of 10. In how many 
ways may she do this? 


1.19 Evaluate C(20, 10). 

1.20 Give the expansion of $(a + b)^{10}$. 

1.21 What is the coefficient of $a^{10}b^{10}$ in the expansion of $(a + b)^{20}$? 

1.22 Give simple formulas for $\binom{n}{1}$, $\binom{n}{2}$, and $\binom{n}{3}$. 

1.23 Explain, in terms of counting, the formula 

C(n,k) = a a). 

1.24 A pointer starts at 0 on the real number line and moves right or left one 
unit at each step. Let n and k be positive integers. How many different paths 
of k steps terminate at the integer n? 

1.25 Give the expansion of $(a + b + c)^4$. 

1.26 What is the coefficient of $x^3y^7$ in the expansion of $(x + y + 1)^{20}$? 
1.27 Show that the multinomial coefficient 


is equal to a product of binomial coefficients. 
1.28 Prove the following relations for multinomial coefficients: 


me n r 7h 
k, -—1 ko ki, ko —1, kg ky, ke, kg — 1 : 


(a = w)- “(ey 
(0,243) “Costs) 
(1.0.45) “(es) 
(a, 0) “(a) 


1.29 Prove the multinomial theorem. 


1.30 (a) How many paths in $\mathbf{R}^2$ start at the origin (0, 0), move in steps of (1, 
0) or (0, 1), and end at (10, 15)? 

(b) How many paths in $\mathbf{R}^3$ start at the origin (0, 0, 0), move in steps of (1, 
0, 0), (0, 1, 0), or (0, 0, 1), and end at (10, 15, 20)? 


1.4 Binomial coefficient identities 


Looking at Pascal’s triangle (Figure 1.1), we see quite a few patterns. Notice that 
the triangle is symmetric about a vertical line down the middle. To prove this, let 
X be an n-set. Then a natural bijection between the collection of k-subsets of X 
and the collection of (n — k)-subsets of X (simply pair each subset with its 
complement) shows that the two binomial coefficients in question are equal: 


(1.9) $\binom{n}{k} = \binom{n}{n-k}.$ 


This identity also follows instantly from the formulas for $\binom{n}{k}$ and $\binom{n}{n-k}$. 

Many identities can be proved both algebraically and combinatorially. Often, 
the combinatorial proof is more transparent. 

The rule that generates Pascal’s triangle (together with the values 
$\binom{n}{0} = \binom{n}{n} = 1$) is known as Pascal’s identity. 


Pascal’s identity. 

$$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}, \quad 1 \le k \le n-1.$$ 

Pascal’s identity has a simple combinatorial proof. The binomial coefficient 
$\binom{n}{k}$ is the number of k-subsets of the set {1,...,n}. Each such subset either 
contains the element 1 or does not contain 1. The number of k-subsets that 
contain 1 is $\binom{n-1}{k-1}$. The number of k-subsets that do not contain 1 is $\binom{n-1}{k}$. The 
identity follows from this observation. 

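The combinatorial proof of Pascal’s identity is more enlightening than the 
following algebraic derivation: 

$$\begin{aligned}
\binom{n-1}{k-1} + \binom{n-1}{k} &= \frac{(n-1)!}{(k-1)!\,(n-k)!} + \frac{(n-1)!}{k!\,(n-k-1)!} \\
&= \frac{(n-1)! \cdot k}{k!\,(n-k)!} + \frac{(n-1)! \cdot (n-k)}{k!\,(n-k)!} \\
&= \frac{(n-1)! \cdot (k + n - k)}{k!\,(n-k)!} \\
&= \frac{(n-1)! \cdot n}{k!\,(n-k)!} \\
&= \frac{n!}{k!\,(n-k)!} \\
&= \binom{n}{k}.
\end{aligned}$$ 

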
EXAMPLE 1.12 Sum of a row of Pascal’s triangle 


The sum of the entries of the nth row of Pascal’s triangle is $2^n$. 

(1.10) $\sum_{k=0}^{n} \binom{n}{k} = 2^n.$ 

Combinatorially, this identity says that the number of subsets of an n-set is 
equal to the number of k-subsets of the n-set, summed over all k = 0, ..., n. 
The identity also follows by putting a = b = 1 in the binomial theorem. 


EXAMPLE 1.13 Alternating sum of a row of Pascal’s triangle 

Evaluate $\sum_{k=0}^{n} (-1)^k \binom{n}{k}$. 
Solution: We give three solutions. (1) Letting a = −1 and b = 1 in the binomial 
theorem, we obtain 

(1.11) $\sum_{k=0}^{n} (-1)^k \binom{n}{k} = \begin{cases} 1 & \text{for } n = 0 \\ 0 & \text{for } n > 0. \end{cases}$ 

(2) Here is a combinatorial proof. Consider the equivalent formulation 

$$\sum_{k \text{ odd}} \binom{n}{k} = \sum_{k \text{ even}} \binom{n}{k}.$$ 

This relation says that, for any n > 0, the number of subsets of X = {1,...,n} with 
an odd number of elements is equal to the number of subsets with an even 
number of elements. For n odd, this assertion follows trivially from the 
symmetry of the binomial coefficients. We give a combinatorial argument valid 
for any n > 0. Let 

A = {S ⊆ X : |S| is even and 1 ∈ S} 
B = {S ⊆ X : |S| is odd and 1 ∈ S} 
C = {S ⊆ X : |S| is even and 1 ∉ S} 
D = {S ⊆ X : |S| is odd and 1 ∉ S}. 

The obvious bijections between A and D and between B and C establish that |A| 
= |D| and |B| = |C|, and hence |A| + |C| = |B| + |D|. 
The identity follows immediately. 
(3) The identity can be turned into a telescoping series. For n > 0, we have 

$$\sum_{k=0}^{n} (-1)^k \binom{n}{k} = \sum_{k=0}^{n} (-1)^k \left[ \binom{n-1}{k-1} + \binom{n-1}{k} \right] = 0.$$ 
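
Identities (1.10) and (1.11) are easy to test for small n. The sketch below also checks the even/odd formulation used in the combinatorial proof by enumerating subsets directly; the function names are illustrative.

```python
from itertools import combinations
from math import comb

def row_sum(n):
    return sum(comb(n, k) for k in range(n + 1))

def alternating_row_sum(n):
    return sum((-1) ** k * comb(n, k) for k in range(n + 1))

def even_odd_subset_counts(n):
    """Count subsets of {1, ..., n} of even size and of odd size by enumeration."""
    even = odd = 0
    for k in range(n + 1):
        count = sum(1 for _ in combinations(range(1, n + 1), k))
        if k % 2 == 0:
            even += count
        else:
            odd += count
    return even, odd

for n in range(9):
    print(n, row_sum(n), alternating_row_sum(n), even_odd_subset_counts(n))
```

For n = 0 the only subset is the empty set, which is why the alternating sum equals 1 in that case.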


EXAMPLE 1.14 Sum of squares of a row of Pascal’s triangle 

What is the sum $\sum_{k=0}^{n} \binom{n}{k}^2$? 
Solution: Let’s work out some instances of the sum using Pascal’s triangle: 

n = 0:  $1^2 = 1$ 
n = 1:  $1^2 + 1^2 = 2$ 
n = 2:  $1^2 + 2^2 + 1^2 = 6$ 
n = 3:  $1^2 + 3^2 + 3^2 + 1^2 = 20$ 
n = 4:  $1^2 + 4^2 + 6^2 + 4^2 + 1^2 = 70.$ 

We recognize these sums as central binomial coefficients and conjecture that 

(1.12) $\sum_{k=0}^{n} \binom{n}{k}^2 = \binom{2n}{n}.$ 

Typically, the mathematical process consists of working examples, looking for 
patterns, making conjectures, and proving the conjectures. Let’s try to prove our 
conjecture. 

We rewrite our conjecture as follows: 

$$\binom{n}{0}^2 + \binom{n}{1}^2 + \binom{n}{2}^2 + \cdots + \binom{n}{n}^2 = \binom{2n}{n}.$$ 

We know that the right side counts the ways of selecting n numbers from the set 
{1,2,3,...,2n}. Why is this counted by the left side? Rewrite just a little, using 
symmetry: 

$$\binom{n}{0}\binom{n}{n} + \binom{n}{1}\binom{n}{n-1} + \binom{n}{2}\binom{n}{n-2} + \cdots + \binom{n}{n}\binom{n}{0} = \binom{2n}{n}.$$ 

Now the truth of the identity is clear. The right side counts the number of n- 
subsets of {1,2,3,...,2n}. The left side counts the same thing, according to the 
number of elements that are chosen from the subset {1,2,3,...,n}. 

This identity has an interesting combinatorial interpretation. The binomial 
coefficient $\binom{2n}{n}$ is the number of northeast paths which start at the southwest 
corner of an n × n grid and stop at the northeast corner. Such paths are of length 
2n and are determined by a sequence of n “easts” and n “norths” in some order. 
The summation $\sum_{k=0}^{n} \binom{n}{k}^2$ counts the paths according to their intersection with 
the main diagonal of the grid. The number of paths that cross the diagonal at the 
point i units east of the starting point is $\binom{n}{i}^2$, where 0 ≤ i ≤ n. 
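
The lattice-path interpretation can be tested by brute force. In the sketch below (the function name `paths_by_diagonal_crossing` is an illustrative assumption), a path is encoded by the positions of its n east steps among 2n steps, and paths are grouped by how many east steps occur among the first n steps, which is the point where the path meets the diagonal.

```python
from itertools import combinations
from math import comb

def paths_by_diagonal_crossing(n):
    """counts[i] = number of northeast paths on an n x n grid that meet the
    diagonal at the point i units east of the starting point."""
    counts = [0] * (n + 1)
    for east_positions in combinations(range(2 * n), n):
        # After n steps the path has taken i east steps and n - i north steps.
        i = sum(1 for pos in east_positions if pos < n)
        counts[i] += 1
    return counts

n = 4
counts = paths_by_diagonal_crossing(n)
print(counts)
print(sum(counts), comb(2 * n, n))
```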


Other binomial coefficient identities may be obtained by comparing like powers 
of x in certain algebraic identities. For example, comparing coefficients of $x^k$ in 
the polynomial identity $(x + 1)^{m+n} = (x + 1)^m (x + 1)^n$, 
we obtain Vandermonde’s identity. 


Vandermonde’s identity. 

$$\binom{m+n}{k} = \sum_{i=0}^{k} \binom{m}{i}\binom{n}{k-i}.$$ 

Vandermonde’s identity has a combinatorial interpretation. The binomial 
coefficient $\binom{m+n}{k}$ is the number of k-subsets of the (m + n)-set A ∪ B, where A = 
{1,...,m} and B = {m + 1,...,m + n}. The number of such subsets that contain i 
elements of A (and k − i elements of B) is $\binom{m}{i}\binom{n}{k-i}$, and the summation 
$\sum_{i=0}^{k} \binom{m}{i}\binom{n}{k-i}$ counts these subsets for i = 0,...,k. 

Letting m = 1, and changing n to n − 1, the relation becomes Pascal’s identity. 

Putting k = m = n, we obtain a previously seen identity: 

$$\binom{2n}{n} = \sum_{i=0}^{n} \binom{n}{i}^2.$$ 
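
Vandermonde’s identity can be verified numerically over a range of parameters. The sketch below uses `math.comb`, which returns 0 when the lower index exceeds the upper one, matching the convention C(n, k) = 0 for k > n; the function name `vandermonde_sum` is illustrative.

```python
from math import comb

def vandermonde_sum(m, n, k):
    """Sum over i of C(m, i) * C(n, k - i), the right side of the identity."""
    return sum(comb(m, i) * comb(n, k - i) for i in range(k + 1))

for m in range(7):
    for n in range(7):
        for k in range(m + n + 1):
            assert vandermonde_sum(m, n, k) == comb(m + n, k)

# The special case k = m = n recovers the sum-of-squares identity.
print(vandermonde_sum(5, 5, 5), comb(10, 5))
```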
Here is another algebraic identity: 

$$(x + 1)^{m+n+1} = \underbrace{(x + 1) \cdots (x + 1)}_{m+n+1}.$$ 

The coefficient of $x^{n+1}$ on the left side is $\binom{m+n+1}{n+1}$. On the right side, there is a 
contribution to $x^{n+1}$ whenever we multiply x’s from n + 1 of the factors. 
Suppose that the rightmost factor which contributes an x is the (n + i + 1)st 
factor, where 0 ≤ i ≤ m. This leaves us free to choose n other x's from a set of n + 
i factors. Hence the coefficient of $x^{n+1}$ on the right side is $\sum_{i=0}^{m} \binom{n+i}{n}$. This 
proves the identity 

(1.13) $\binom{m+n+1}{n+1} = \sum_{i=0}^{m} \binom{n+i}{n}.$ 

The identity has a combinatorial interpretation. The binomial coefficient 
$\binom{m+n+1}{n+1}$ is the number of (n+1)-subsets of the (m+n+1)-set {1,...,m+n+1}. 
Suppose that the largest element in such a subset is n + i + 1, where 0 ≤ i ≤ m. 
The number of such subsets is $\binom{n+i}{n}$, and the summation $\sum_{i=0}^{m} \binom{n+i}{n}$ counts them 
all. 


Subcommittee identity. For 0 ≤ j ≤ k ≤ n, we have 

$$\binom{n}{k}\binom{k}{j} = \binom{n}{j}\binom{n-j}{k-j}.$$ 

Proof. Both expressions count the number of ways to choose, from n people, a 
committee of size k and a subcommittee of size j. 
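
Both identity (1.13) and the subcommittee identity can be checked by a short computation. The following Python sketch is illustrative; the function names are assumptions, and the last check counts (committee, subcommittee) pairs directly for one small case.

```python
from itertools import combinations
from math import comb

def largest_element_sum(m, n):
    """Right side of (1.13): sum of C(n + i, n) for i = 0, ..., m."""
    return sum(comb(n + i, n) for i in range(m + 1))

def subcommittee_sides(n, k, j):
    """Both sides of the subcommittee identity."""
    return comb(n, k) * comb(k, j), comb(n, j) * comb(n - j, k - j)

for m in range(8):
    for n in range(8):
        assert largest_element_sum(m, n) == comb(m + n + 1, n + 1)

for n in range(9):
    for k in range(n + 1):
        for j in range(k + 1):
            left, right = subcommittee_sides(n, k, j)
            assert left == right

# Direct count for n = 6, k = 4, j = 2.
pairs = sum(
    1
    for committee in combinations(range(6), 4)
    for _ in combinations(committee, 2)
)
print(pairs, subcommittee_sides(6, 4, 2))
```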


EXAMPLE 1.15 


Prove the identity: 
mn n 
x «( ) 5 gana 
a < 
Solution: We will give four proofs. 
(1) The first proof is algebraic. We can “pull an n out” of each term in the sum 
to obtain 


n 


= n n! 
#(2) - ‘in — by 
k=] k=1 


7 = (n —1)! 
. 4 (k —1)\(n — k)! 


=n2*—?, 

(2) The second proof is by counting. Consider all possible ways of choosing a 
team and a team leader from a set of n people. The left side clearly counts this, 
according to the size k of the team. The right side counts the same thing, as we 
have n choices for the leader and each other person can be on or off the team. 


(3) The third proof uses calculus. From the binomial theorem, we have
$$(x+1)^n = \sum_{k=0}^{n} \binom{n}{k} x^k.$$
Taking a derivative “brings a k down,” so
$$\frac{d}{dx}(x+1)^n = n(x+1)^{n-1} = \sum_{k=1}^{n} k\binom{n}{k} x^{k-1}.$$
Evaluating both sides of the last relation at x = 1 gives our desired identity.
(4) Let’s also do a proof via probability. Upon division by $2^n$, our identity
becomes
$$\sum_{k=1}^{n} k\binom{n}{k}\left(\frac{1}{2}\right)^n = \frac{n}{2}.$$
Here is a probabilistic interpretation. Let X be a set of n elements. For each
element of X, flip a fair coin and if the coin comes up heads put the element in a
subset S. What is the expected size of S? Both sides of the identity give the
answer!
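
The counting and probabilistic proofs can be mirrored in a few lines of code. This is a minimal sketch for small n, not part of the original text; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def weighted_subset_sum(n: int) -> int:
    """Left side of the identity: the sum of k * C(n, k) for k = 1, ..., n."""
    return sum(k * comb(n, k) for k in range(1, n + 1))

def team_leader_pairs(n: int) -> int:
    """Count (team, leader) pairs directly: each team of size k has k possible leaders."""
    people = range(n)
    return sum(len(team)
               for k in range(1, n + 1)
               for team in combinations(people, k))

def expected_subset_size(n: int) -> Fraction:
    """Expected size of a uniformly random subset of an n-set."""
    return Fraction(weighted_subset_sum(n), 2 ** n)
```

For instance, `weighted_subset_sum(5)` and `team_leader_pairs(5)` both evaluate to 5 · 2^4 = 80, and `expected_subset_size(6)` is 3.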


EXAMPLE 1.16 
Prove the identity
$$\sum_{k=0}^{n} \binom{n+k}{k} \frac{1}{2^k} = 2^n.$$
Solution: We give a counting proof of the equivalent identity
$$\sum_{k=0}^{n} \binom{n+k}{k}\, 2 \cdot 2^{n-k} = 2^{2n+1}.$$
The right side of this relation is the number of binary strings of length 2n + 1.
We must show that the left side counts the same strings. Every binary string of
length 2n + 1 contains at least n + 1 0’s or at least n + 1 1’s (but not both).
Counting from the left, let n + k + 1, where 0 ≤ k ≤ n, be the position of the
(n + 1)st 0 or (n + 1)st 1. There are two possibilities for this element (0 or 1); there
are $\binom{n+k}{k}$ binary strings of length n + k that contain n of one symbol and k of the
other; and there are $2^{n-k}$ choices for the remaining n − k elements. This
establishes the identity.


EXAMPLE 1.17 An object moving in the plane 


An object travels along the integer points of the plane, starting at the point
(0, 0). At each step, the object moves one unit to the right or one unit up
(with equal probability). The object stops when it reaches the line x = n or
the line y = n. Show that the expected length of the object’s path is
$$2n - n\binom{2n}{n}2^{1-2n}.$$

Solution: Assume that the object hits the line x = n at the point (n, k) or the line y
= n at the point (k, n), where 0 ≤ k ≤ n − 1. Then the expected path length is
given by
$$\sum_{k=0}^{n-1} (n+k)\cdot 2\binom{n+k-1}{n-1}\left(\frac{1}{2}\right)^{n+k}
= \left(\frac{1}{2}\right)^{n-1} \sum_{k=0}^{n-1} (n+k)\binom{n+k-1}{n-1}\left(\frac{1}{2}\right)^{k}$$
$$= \left(\frac{1}{2}\right)^{n-1} \sum_{k=0}^{n-1} n\binom{n+k}{n}\left(\frac{1}{2}\right)^{k}
= n\left(\frac{1}{2}\right)^{n-1} \sum_{k=0}^{n-1} \binom{n+k}{k}\left(\frac{1}{2}\right)^{k}.$$
By the result of Example 1.16, this simplifies to
$$n\left(\frac{1}{2}\right)^{n-1}\left[\sum_{k=0}^{n} \binom{n+k}{k}\left(\frac{1}{2}\right)^{k} - \binom{2n}{n}\left(\frac{1}{2}\right)^{n}\right]
= 2n - n\binom{2n}{n}2^{1-2n}.$$
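
The closed form can be compared with an exact computation of the expectation straight from the rules of the walk. The sketch below is an illustration only; `expected_path_length` and `closed_form` are hypothetical names, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb

def expected_path_length(n: int) -> Fraction:
    """Expected number of steps until the walk from (0, 0) reaches x = n or y = n."""
    @lru_cache(maxsize=None)
    def remaining(x: int, y: int) -> Fraction:
        if x == n or y == n:
            return Fraction(0)  # the object has stopped
        # One step is taken, then the walk continues right or up with probability 1/2 each.
        return 1 + Fraction(1, 2) * (remaining(x + 1, y) + remaining(x, y + 1))
    return remaining(0, 0)

def closed_form(n: int) -> Fraction:
    """The stated answer 2n - n * C(2n, n) * 2^(1 - 2n), for n >= 1."""
    return 2 * n - Fraction(n * comb(2 * n, n), 2 ** (2 * n - 1))
```

For n = 2 both functions give 5/2: half of the paths stop after two steps and the other half after three.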


The binomial theorem extends to arbitrary exponents. For any real number α and
k a positive integer, define

(1.14) $$\binom{\alpha}{k} = \frac{\alpha(\alpha-1)\cdots(\alpha-k+1)}{k!}.$$
Also, define $\binom{\alpha}{0} = 1$.


EXAMPLE 1.18 
$$\binom{-3}{4} = \frac{(-3)(-4)(-5)(-6)}{4!} = 15.$$


Binomial series. Let α be a real number and |x| < 1. Then
$$(1+x)^{\alpha} = \sum_{k=0}^{\infty} \binom{\alpha}{k} x^k.$$


EXAMPLE 1.19 


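Let n be an integer greater than or equal to 1. Prove the formula
$$(1+x)^{-n} = \sum_{k=0}^{\infty} (-1)^k \binom{n+k-1}{k} x^k, \quad |x| < 1.$$

Solution: By the binomial series theorem,
$$(1+x)^{-n} = \sum_{k=0}^{\infty} \binom{-n}{k} x^k.$$
The result now follows from the identity (see Exercises)
$$\binom{-n}{k} = (-1)^k \binom{n+k-1}{k}.$$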


Pascal’s triangle extends upward, as in Figure 1.2. (In the figure, the triangle is
left-justified.) Pascal’s identity, together with the initial values $\binom{n}{0} = 1$ for all
integers n, is used to calculate entries $\binom{-n}{k}$, where n is a positive integer. The
entries are the numbers $(-1)^k \binom{n+k-1}{k}$ given by the binomial series theorem as the
coefficients of $x^k$ in the power series expansion of $(1 + x)^{-n}$.


Figure 1.2 The extended Pascal’s triangle.
1   -4   10  -20   35  -56
1   -3    6  -10   15  -21
1   -2    3   -4    5   -6
1   -1    1   -1    1   -1
1    0    0    0    0    0
1    1    0    0    0    0
1    2    1    0    0    0
1    3    3    1    0    0
1    4    6    4    1    0
1    5   10   10    5    1


EXAMPLE 1.20 


Give the first several terms of the expansion of $(1 + x)^{-4}$ in powers of x.
Solution: We can see the coefficients in row −4 of the extended Pascal’s triangle.
Thus
$$(1+x)^{-4} = 1 - 4x + 10x^2 - 20x^3 + 35x^4 - 56x^5 + \cdots.$$
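
The generalized binomial coefficient of (1.14) is easy to compute directly, which gives another way to read off such expansions. The following is a small sketch; `gen_binom` and `series_coefficients` are hypothetical helper names.

```python
from fractions import Fraction

def gen_binom(alpha, k: int) -> Fraction:
    """Generalized binomial coefficient alpha(alpha-1)...(alpha-k+1)/k!, with value 1 at k = 0."""
    result = Fraction(1)
    for j in range(k):
        # Multiply by (alpha - j) / (j + 1) so that every partial product stays exact.
        result *= Fraction(alpha - j) / (j + 1)
    return result

def series_coefficients(alpha, terms: int):
    """First `terms` coefficients of the binomial series for (1 + x)^alpha."""
    return [gen_binom(alpha, k) for k in range(terms)]
```

Here `series_coefficients(-4, 6)` reproduces row −4 of the extended triangle, and `gen_binom(-3, 4)` gives 15 as in Example 1.18. Rational exponents such as `Fraction(1, 2)` work as well.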


EXERCISES 
1.31 Prove the following identities: 
(a) (f) = 8 (Ra) 
(b) eg x (Teer "I 
1.32 Prove the identity 
$$1\cdot 2 + 2\cdot 3 + 3\cdot 4 + \cdots + n\cdot(n+1) = \frac{n(n+1)(n+2)}{3}.$$


Generalize. 
1.33 Prove the following identities: 


(2) ) = Deo CE) 
(b) (3) = hao (k) F) 

1.34 (a) Prove the identity $\binom{n}{k} = \frac{n-k+1}{k}\binom{n}{k-1}$.
(b) Use the identity of part (a) to show that the entries of each row of 
Pascal’s triangle increase from left to right, attain a maximum value at the 
middle entry (or two middle entries), and then decrease. 


1.35 Prove the inequality $\binom{n}{k}^2 > \binom{n}{k-1}\binom{n}{k+1}$, where 1 ≤ k ≤ n − 1.

1.36 Suppose that five particles are traveling back and forth on the unit 
interval [0, 1]. Initially, all the particles move to the right with the same 
speed. (The initial placement of the particles does not matter, as long as they 
are not at the endpoints.) When a particle reaches 0 or 1, it reverses direction 
but maintains its speed. When two particles collide, they both reverse 
direction (and maintain speeds). How many particle—particle collisions 
occur before the particles once again occupy their original positions and are 
moving to the right? 


1.37 Show that 2n people may be grouped into n pairs in $(2n)!/(n!\,2^n)$ ways.
1.38 How many ways can 3n people be grouped into n trios? 

1.39 How many ways can kn people be grouped into n subgroups of size k?
1.40 Prove that the number of binary strings of length n that contain exactly
k copies of the string 10 is $\binom{n+1}{2k+1}$.


1.41 Give the first several terms of the expansion of $(1 + x)^{1/2}$ in powers of
x.


1.42 Give the first several terms of the expansion of (1 + x)? in powers of 
x: 


1.43 Give the first several terms of the expansion of (1 + x) 2 in powers of 
ne 


1.44 Prove the identity 
ntjtsk t® a 
3 3 n, j,k ; ae 
j=0 k=0 Js 
1.45 For each integer k ≥ 0, define
$$S_k(n) = \sum_{i=1}^{n} i^k.$$
Give formulas for $S_0(n)$, $S_1(n)$, $S_2(n)$, and $S_3(n)$. Prove that $S_k(n)$ is a
polynomial in n of degree k + 1 and leading coefficient 1/(k + 1).

1.46 An n-dimensional hypercube consists of all binary n-tuples. Two such
n-tuples are joined by an edge if they disagree in exactly one coordinate.
Prove that the number of k-dimensional faces of an n-dimensional hypercube
is
$$\binom{n}{k}2^{n-k}, \quad 0 \le k \le n.$$


1.5 Distributions 


Problems in which elements of a set are divided into categories are called 
distribution problems. Let’s consider a simple scenario. Suppose that five $1 
bills are to be distributed among three people. In how many ways can this be 
done? The answer depends on whether the people are to be considered as 
identifiable in some way, and the same goes for the dollar bills. For instance, 
suppose that the people are named Amy, Bobby, and Carly, and the dollar 
bills have serial numbers so they are identifiable. Then there are three 
choices for who gets the first dollar bill, three choices for who gets the 


second dollar bill, and so on. Altogether, there are $3^5$ ways to distribute the
five dollars to the three people. 

If the dollar bills are interchangeable, then we have a so-called “stars and 
bars” situation. The number of distributions is the number of ways to arrange 
five dollar signs (or stars) and two vertical lines (or bars) partitioning the 
three people along a line. The number of ways is C(7, 2). 

If the three people are anonymous but the five bills are numbered, then the 
number of distributions is given by the Stirling numbers of the second kind. 
We will see more about this later in this section. 

If the people are anonymous and the bills are interchangeable, then we 
have what are called partition numbers. We will see more about this later,
too. 


EXAMPLE 1.21 


How many solutions in nonnegative integers are there to the equation
$x_1 + x_2 + x_3 = 10$?
Solution: We can think of the 10 on the right side of the equation as
representing 10 units that can be distributed to the three variables, $x_1$, $x_2$,
and $x_3$. Such a distribution can be pictured with a linear ordering of 10 *’s
(to represent the units) and two vertical lines (to indicate the partitioning of
the units among the variables). For instance, the solution 3 + 2 + 5 = 10 is
shown as * * * | * * | * * * * *.

Thus, finding the number of distributions is a MISSISSIPPI-type problem.
As there are 12 symbols altogether (10 *’s and two vertical lines), the
number of solutions is $\frac{12!}{10!\,2!} = 66$.


Distribution of identical objects into distinguishable classes. The number of ways to
distribute k identical objects among n distinguishable classes is $\binom{n+k-1}{k}$. This is the same as
the number of nonnegative integer solutions to $x_1 + x_2 + \cdots + x_n = k$.
By contrast, the number of ways to distribute k distinguishable objects into
n distinguishable classes is $n^k$.
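
The stars-and-bars count can be confirmed by brute force for small cases. This sketch simply enumerates all candidate tuples, so it is only practical for small n and k; `count_solutions` is a hypothetical name.

```python
from itertools import product
from math import comb

def count_solutions(n: int, k: int) -> int:
    """Count nonnegative integer solutions of x_1 + ... + x_n = k by enumeration."""
    return sum(1 for xs in product(range(k + 1), repeat=n) if sum(xs) == k)

# Example 1.21: three variables summing to 10.
assert count_solutions(3, 10) == comb(3 + 10 - 1, 10)
# Five interchangeable dollar bills among three named people.
assert count_solutions(3, 5) == comb(7, 2)
```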


A partition of X is a collection C of nonempty pairwise-disjoint subsets of 
X whose union equals X. The members of C are called the parts of the 
partition. 

An equivalence relation on X is a relation on X that is reflexive, 
symmetric, and transitive. If R is an equivalence relation on X, then, for each 
a ∈ X, the set [a] = {b ∈ X : (a, b) ∈ R} is the equivalence class of a.


Equivalence of equivalence relations and partitions. Let X be a nonempty set. The 
equivalence classes of an equivalence relation on X are the parts of a partition of X. 
Conversely, the parts of a partition of X are the equivalence classes of an equivalence relation 
on X. 


Proof. Given an equivalence relation R on X, we will show that $\mathcal{C}$ = {[x] : x ∈
X} is a partition of X. First, each member [x] of $\mathcal{C}$ is nonempty (it contains
x). Second, the union of the members of $\mathcal{C}$ is all of X, since each element x ∈
X is contained in a member of $\mathcal{C}$, namely, [x]. Third, distinct members of $\mathcal{C}$ are
disjoint. For suppose that [x] ∩ [y] is nonempty for some x, y ∈ X; assume
that z ∈ [x] ∩ [y]. Then, since (x, z) ∈ R and (y, z) ∈ R, it follows by symmetry
and transitivity that (y, x) ∈ R. Let x′ be an arbitrary element of [x]. Then,
since (x, x′) ∈ R, it follows by transitivity that (y, x′) ∈ R, and
hence x′ ∈ [y]. Since x′ is an arbitrary element of [x], we conclude that [x] ⊆
[y]. A similar argument shows that [y] ⊆ [x] and therefore [x] = [y].

Now suppose that $\mathcal{C}$ is a partition of X, and define a relation R on X so that
(x, y) ∈ R if x, y ∈ C for some C ∈ $\mathcal{C}$. We will show that R is an equivalence
relation on X. Since $\mathcal{C}$ is a partition of X, each x ∈ X is an element of some
member of $\mathcal{C}$; hence R is reflexive. If x and y are both elements of some
member C of $\mathcal{C}$, then the same can be said of y and x; hence R is symmetric.
As for transitivity, if x and y are both elements of C for some C ∈ $\mathcal{C}$, and y
and z are both elements of D for some D ∈ $\mathcal{C}$, then C = D (since y lies in both
and distinct parts of a partition are disjoint). Hence, x and z are both elements
of the same member of $\mathcal{C}$, and R is transitive.


EXAMPLE 1.22 
How many partitions of the set {1, 2, 3, 4} are there? 


Solution: In a partition of a set of four elements, the sizes of the equivalence
classes sum to 4. There are five possibilities for these sizes:
4
3 + 1
2 + 2
2 + 1 + 1
1 + 1 + 1 + 1.
For example, the partition
{{1, 2}, {3}, {4}}
is of type 2 + 1 + 1. It is an easy matter to count the partitions of each type,
obtaining, respectively, 1, 4, 3, 6, and 1, for a total of 15.
The nth Bell number, denoted B(n), is the total number of partitions of the
set {1, 2, 3,...,n}. The Stirling number of the second kind ${n \brace k}$ is the number
of partitions of {1, 2, 3,...,n} into k equivalence classes. The partition
number p(n) is the total number of partitions of a set of n indistinguishable
elements. These are also called partitions of an integer.
According to the above example,
B(4) = 15, ${4 \brace 1}$ = 1, ${4 \brace 2}$ = 7, ${4 \brace 3}$ = 6, ${4 \brace 4}$ = 1, and p(4) = 5.
We also define p(n, k) to be the number of partitions of n indistinguishable
objects into k parts. From the above example, p(4, 1) = 1, p(4, 2) = 2, p(4, 3)
= 1, and p(4, 4) = 1.
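
These small values can be reproduced by listing every partition of {1, 2, 3, 4} explicitly. The sketch below is one straightforward way to do so; the generator `set_partitions` and the variable names are hypothetical.

```python
from collections import Counter

def set_partitions(elements):
    """Yield every partition of the list `elements` as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition

partitions = list(set_partitions([1, 2, 3, 4]))
bell_4 = len(partitions)                               # B(4)
stirling_4 = Counter(len(p) for p in partitions)       # k -> Stirling number for n = 4
# Forgetting which elements are in which block leaves only the multiset of block sizes.
types_4 = {tuple(sorted((len(b) for b in p), reverse=True)) for p in partitions}
```

Here `bell_4` is 15, `stirling_4` maps 1, 2, 3, 4 to 1, 7, 6, 1, and `types_4` contains the five integer partitions of 4.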


EXERCISES 


1.47 You can order a pizza with up to four toppings (repetitions 
allowed) from a set of 12 toppings. The order of the toppings is 
unimportant. How many different pizzas can you order? 


1.48 In how many ways may k indistinguishable balls be placed in n 
distinguishable urns so that each urn contains an odd number of balls? 
1.49 (a) Find a formula for the number of functions f: Ny > Np with 
the property that f(x) < f(y) whenever 1 <x<y<m. 
(b) Find a formula for the number of functions f: Ny > Np with the 
property that f(x) < f(y) whenever 1 <x<y<m. 


1.50 Find {9}, {3}, {3}. {3}. {2}-and BG). 
1.51 Find p(5, 1), p(5, 2), p(5, 3), p(5, 4), and p(5). 
1.52 Prove that ${n \brace 2} = 2^{n-1} - 1$ and ${n \brace n-1} = \binom{n}{2}$ for n ≥ 2.
1.53 Determine the number of nonnegative integer solutions to the 
equation 

a+2b+4c = 10%. 
1.54 Let $S(n) = |\{(k_1,\ldots,k_m) : m, k_i \in \mathbf{N},\ \sum_{i=1}^{m} k_i = n\}|$. Find with
proof a formula for S(n). Note that S(n) counts the number of ways n
may be written as $n = k_1 + \cdots + k_m$ for any m (order important). Such
summations are called compositions of n.
1.55 How many commutative groups of order one million are there? 


1.6 The principle of inclusion and exclusion 


The inclusion—exclusion principle is a generalization of the familiar Venn 
diagram rule. 


Venn diagram rule. If A and B are finite sets, then 


(1.16) |A ∪ B| = |A| + |B| − |A ∩ B|.


Proof. See Figure 1.3, which shows two sets, A and B, and their union and 
intersection. The sum |A| + |B| counts all the elements of A U B, but the 
elements of A ∩ B are counted twice and therefore must be removed as on
the right side of (1.16). 


Figure 1.3 A Venn diagram for two sets. 


Inclusion-exclusion principle. If $A_1,\ldots,A_n$ are subsets of a finite set S, then

(1.17) $$|A_1 \cup \cdots \cup A_n| = \sum_{i=1}^{n} (-1)^{i+1} \sum |A_{k_1} \cap \cdots \cap A_{k_i}|,$$

where the second sum is over all i-tuples $(k_1,\ldots,k_i)$ with $1 \le k_1 < \cdots < k_i \le n$.


Proof. Let s ∈ S and assume that s is contained in exactly m of the $A_i$. The
contribution of s to the right side of (1.17) is 0 if m = 0. If m ≥ 1, then the
contribution is
$$\sum_{i=1}^{n} (-1)^{i+1}\binom{m}{i} = \sum_{i=1}^{m} (-1)^{i+1}\binom{m}{i} \quad \text{(because } m \le n\text{)}$$
$$= (-1)\left[\sum_{i=0}^{m} \binom{m}{i}(-1)^i - 1\right]$$
$$= (-1)\left[(-1+1)^m - 1\right]$$
$$= 1.$$
Therefore, each s ∈ S not in the union of the $A_i$ contributes zero to both sides
of (1.17), while each s ∈ S in the union contributes 1. This means that each
element of S contributes an equal amount to both sides of (1.17); hence,
(1.17) is a valid relation.
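
The right side of (1.17) translates almost literally into code. The following sketch evaluates it for a list of Python sets; the function name and the sample sets (multiples of 2, 3, and 5 up to 30) are hypothetical choices for illustration.

```python
from itertools import combinations

def union_size_by_inclusion_exclusion(sets) -> int:
    """Evaluate the right side of (1.17) for a list of finite sets."""
    n = len(sets)
    total = 0
    for i in range(1, n + 1):
        sign = (-1) ** (i + 1)
        # combinations() yields each i-subset of the sets once, i.e. k_1 < ... < k_i.
        for chosen in combinations(sets, i):
            total += sign * len(set.intersection(*chosen))
    return total

# Multiples of 2, 3, and 5 among 1, ..., 30.
A = [set(range(d, 31, d)) for d in (2, 3, 5)]
assert union_size_by_inclusion_exclusion(A) == len(set().union(*A))
```

In this example the alternating sum is 15 + 10 + 6 − 5 − 3 − 2 + 1 = 22.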


EXAMPLE 1.23 Derangements 
A permutation with no fixed points is called a derangement. Let $d_n$ be
the number of derangements of n elements. Find a formula for $d_n$.
Solution: For 1 ≤ j ≤ n, let $A_j$ be the set of permutations of {1, 2, 3,...,n}
such that j is a fixed point. Then the intersection of any i of the $A_j$, for 1 ≤ i ≤
n, has (n − i)! elements, for the n − i not necessarily fixed elements may be
permuted arbitrarily. Since there are $\binom{n}{i}$ choices for the $A_j$ that make up the
intersection, by the principle of inclusion and exclusion we have
$$|A_1 \cup \cdots \cup A_n| = \sum_{i=1}^{n} (-1)^{i+1}\binom{n}{i}(n-i)!.$$
We conclude that

(1.18) $$d_n = n! - |A_1 \cup \cdots \cup A_n| = \sum_{i=0}^{n} (-1)^{i}\binom{n}{i}(n-i)! = n!\sum_{i=0}^{n} \frac{(-1)^i}{i!}.$$
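
The inclusion-exclusion count of derangements, the alternating sum of C(n, i)(n − i)! over i = 0, ..., n, can be compared with direct enumeration for small n. This is an illustrative sketch with hypothetical function names.

```python
from itertools import permutations
from math import comb, factorial

def derangements_formula(n: int) -> int:
    """Alternating sum of C(n, i) * (n - i)! for i = 0, ..., n."""
    return sum((-1) ** i * comb(n, i) * factorial(n - i) for i in range(n + 1))

def derangements_brute_force(n: int) -> int:
    """Count permutations of 0, ..., n-1 with no fixed point by listing them all."""
    return sum(1 for p in permutations(range(n))
               if all(p[j] != j for j in range(n)))
```

For n = 1, ..., 6 both functions return 0, 1, 2, 9, 44, 265.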


Inclusion-exclusion principle (probability version). Let $E_1,\ldots,E_n$ be events in a finite
sample space. Then
$$\Pr(E_1 \cup \cdots \cup E_n) = \sum_{i=1}^{n} (-1)^{i+1} \sum \Pr(E_{k_1} \cap \cdots \cap E_{k_i}),$$
where the second sum is over all i-tuples $(k_1,\ldots,k_i)$ with $1 \le k_1 < \cdots < k_i \le n$.


EXAMPLE 1.24 Stirling numbers of the second kind 
Find a formula for the Stirling number of the second kind ${n \brace k}$.

Solution: Using the principle of inclusion and exclusion (see Exercises), we
can obtain
$${n \brace k} = \frac{1}{k!}\sum_{j=0}^{k} (-1)^j \binom{k}{j}(k-j)^n, \quad 1 \le k \le n.$$


EXAMPLE 1.25 Cards 


All 52 playing cards are dealt randomly to four players, 13 cards per
player. What is the probability that at least one person has all cards of
the same suit?
Solution: For 1 ≤ i ≤ 4, let $E_i$ be the event that player i has all cards of the
same suit. By the principle of inclusion and exclusion, the desired
probability, $\Pr(\bigcup E_i)$, is
$$\frac{\binom{4}{1}4}{\binom{52}{13}} - \frac{\binom{4}{2}\binom{4}{2}2!}{\binom{52}{13}\binom{39}{13}} + \frac{\binom{4}{3}\binom{4}{3}3!}{\binom{52}{13}\binom{39}{13}\binom{26}{13}} - \frac{\binom{4}{4}\binom{4}{4}4!}{\binom{52}{13}\binom{39}{13}\binom{26}{13}\binom{13}{13}}$$
= 18772910672458601/745065802298455456100520000
$\approx 2.5 \times 10^{-11}$.
Make sure you understand how the four terms in the first line are obtained.
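
One way to see where the four terms come from is to build them in a loop: choose i players, assign them i distinct suits in order, and divide by the number of ways to deal those i hands. The sketch below follows that reading of the formula and uses exact fractions; the function name is hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def prob_some_player_has_one_suit() -> Fraction:
    """Inclusion-exclusion over the events 'player i holds 13 cards of a single suit'."""
    total = Fraction(0)
    for i in range(1, 5):
        # C(4, i) ways to pick the players, C(4, i) * i! ways to give them distinct suits.
        favourable = comb(4, i) * comb(4, i) * factorial(i)
        # Number of ways to deal 13-card hands to those i players one after another.
        hands = 1
        for j in range(i):
            hands *= comb(52 - 13 * j, 13)
        total += (-1) ** (i + 1) * Fraction(favourable, hands)
    return total
```

The first term, 16 divided by C(52, 13), dominates, so the result is roughly 2.5 × 10^-11.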


EXAMPLE 1.26 “The problem of derangements” 
What is the probability $P_n$ that a random permutation of n elements is a
derangement?

Solution: We found in Example 1.23 that
$$P_n = \frac{d_n}{n!} = \sum_{i=0}^{n} \frac{(-1)^i}{i!}.$$
It may seem strange that a fixed point is less likely to occur when n is 52
than when n is 51 or 53. It is interesting to note that
$$\lim_{n\to\infty} P_n = e^{-1} \approx 0.37.$$
Students of probability should not be surprised to see the appearance of the
number e in a probability calculation.


EXAMPLE 1.27 Average number of fixed points of a 
permutation 


Find the average number of fixed points of a permutation of n elements.
Solution: We illustrate the result in the case n = 3. Below are the
permutations of {1, 2, 3} and the number of fixed points of each.
permutation    number of fixed points
(1)(2)(3)      3
(1)(23)        1
(2)(13)        1
(3)(12)        1
(123)          0
(132)          0

The permutations are written in cycle form. For instance, the permutation (2)
(13) is the one that maps 2 → 2, 1 → 3, and 3 → 1. The total number of
fixed points is 6, and the average number is 6/6 = 1.

Randomly choose a permutation of {1, 2, 3,...,n}. For 1 ≤ i ≤ n, define $X_i$
= 1 if i is fixed and 0 otherwise. Then the number of fixed points is $Y = X_1 +
X_2 + \cdots + X_n$. The expected value of each $X_i$ is (n − 1)!/n! = 1/n. Hence, the
expected number of fixed points is
$$E(Y) = E(X_1) + E(X_2) + \cdots + E(X_n) = \frac{1}{n} + \frac{1}{n} + \cdots + \frac{1}{n} = 1.$$
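
The claim that the average is exactly 1 for every n can be checked by enumeration for small n. A minimal sketch, with a hypothetical function name:

```python
from fractions import Fraction
from itertools import permutations

def average_fixed_points(n: int) -> Fraction:
    """Average number of fixed points over all permutations of 0, ..., n-1."""
    perms = list(permutations(range(n)))
    total = sum(sum(1 for i in range(n) if p[i] == i) for p in perms)
    return Fraction(total, len(perms))

# The average is 1 regardless of n (checked here for n = 1, ..., 6).
assert all(average_fixed_points(n) == 1 for n in range(1, 7))
```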


EXAMPLE 1.28 Bell numbers 


Find a formula for the Bell number B(n). 
Solution: Using the result of Example 1.24, we obtain
$$B(n) = \sum_{k=1}^{n} \frac{1}{k!} \sum_{j=0}^{k} (-1)^j \binom{k}{j}(k-j)^n.$$
The formula can be simplified considerably:
$$B(n) = \frac{1}{e}\sum_{j=0}^{\infty} \frac{j^n}{j!}.$$
This formula is interesting from a number-theoretic point of view, as it is not
at all clear a priori that $(1/e)\sum_{j=0}^{\infty} j^n/j!$ is an integer.

The inclusion-exclusion principle can be generalized to the Bonferroni
inequalities of probability theory. We start with the algebraic identity
$$(1+x)^{n-1} = (1+x)^n(1 - x + x^2 - x^3 + \cdots).$$
Equating coefficients of $x^k$ on both sides of this identity yields

(1.20) $$\binom{n-1}{k} = \sum_{i=0}^{k} (-1)^{i+k}\binom{n}{i}.$$

The above identity can also be proved by applying Pascal’s identity to the
binomial coefficient $\binom{n}{i}$ and collapsing the resulting telescoping sum.


For each 1 ≤ i ≤ n, let

(1.21) $$N_i = \sum |A_{k_1} \cap \cdots \cap A_{k_i}|,$$
where the sum is over all i-tuples $(k_1,\ldots,k_i)$ with $1 \le k_1 < \cdots < k_i \le n$.


Bonferroni inequalities. Let $A_1,\ldots,A_n$ be subsets of a finite set S. If t is an odd number, then
$$|A_1 \cup \cdots \cup A_n| \le \sum_{i=1}^{t} (-1)^{i+1} N_i.$$
If t is even, then the inequality is reversed.


Proof. Let s ∈ S and assume that s is contained in exactly m of the $A_i$. If m =
0, then the contribution to both sides of the inequality is 0. For m > 0, the
result follows easily from (1.20).


EXAMPLE 1.29 


If k is even, we have 


If k is odd, then the inequality is reversed. 
Here is a neat technique that everyone should learn. Suppose that the 
sequence 
7, 11, 25, 73, 203, 487, 1021, 1925, 3343, 5443, 8417,... 
represents the values of a polynomial p(n), where n = 0, 1, 2,.... What is the 
polynomial? 
We take differences of consecutive terms, creating a new sequence: 
4, 14, 48, 130, 284, 534, 904, 1418, 2100, 2974, .... 
We repeat this process, creating a sequence of sequences: 
7, 11, 25, 73, 203, 487, 1021, 1925, 3343, 5443, 8417, 
4, 14, 48, 130, 284, 534, 904, 1418, 2100, 2974, 
10, 34, 82, 154, 250, 370, 514, 682, 874, 
24, 48, 72, 96, 120, 144, 168, 192, 
24, 24, 24, 24, 24, 24, 24, .... 
Having obtained a constant sequence, we stop. 
Now, we find the polynomial by multiplying the first column of our
difference array by successive binomial coefficients and adding:
$$p(n) = 7\binom{n}{0} + 4\binom{n}{1} + 10\binom{n}{2} + 24\binom{n}{3} + 24\binom{n}{4} = n^4 - 2n^3 + 4n^2 + n + 7.$$


This polynomial gives the original sequence, starting at p(0). 
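
The difference-array technique is mechanical enough to automate. The sketch below builds the first column of the array and evaluates the resulting sum of binomial coefficients; `first_column` and `newton_polynomial` are hypothetical names, and the code assumes the input really comes from a polynomial so that the differences eventually vanish.

```python
from math import comb

def first_column(values):
    """Return the first entry of each row of the difference array, stopping at an all-zero row."""
    column = []
    row = list(values)
    while row and any(row):
        column.append(row[0])
        # Differences of consecutive terms form the next row.
        row = [b - a for a, b in zip(row, row[1:])]
    return column

def newton_polynomial(column, n: int) -> int:
    """Evaluate the sum of column[k] * C(n, k) at a nonnegative integer n."""
    return sum(a * comb(n, k) for k, a in enumerate(column))

sequence = [7, 11, 25, 73, 203, 487, 1021, 1925, 3343, 5443, 8417]
column = first_column(sequence)  # [7, 4, 10, 24, 24]
assert all(newton_polynomial(column, n) == sequence[n] for n in range(len(sequence)))
```

The same column predicts the next term: `newton_polynomial(column, 11)` agrees with n^4 − 2n^3 + 4n^2 + n + 7 evaluated at n = 11.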
Why does this work? Suppose that the polynomial is
$$p(x) = a_0 + a_1 x + a_2 \frac{x(x-1)}{2!} + \cdots + a_k \frac{x(x-1)\cdots(x-k+1)}{k!},$$
where the $a_i$ are real numbers. A little reflection shows that we can really
write an arbitrary polynomial in this way.

Suppose that the values of the polynomial are

p(0), p(1), p(2), p(3), p(4), p(5), ....
Letting n = 0, we find that $a_0 = p(0)$ (since all the other terms in the
polynomial are equal to 0).

The sequence of differences is

p(1) − p(0), p(2) − p(1), p(3) − p(2), p(4) − p(3), ....
Letting n = 1, we find that

$p(1) = a_0 + a_1$,
and hence

$a_1 = p(1) - a_0 = p(1) - p(0)$.

The next sequence of differences is

p(2) − 2p(1) + p(0), p(3) − 2p(2) + p(1), p(4) − 2p(3) + p(2), p(5) − 2p(4) + p(3), ....
Letting n = 2, we obtain

$p(2) = a_0 + 2a_1 + a_2$,
and hence

$a_2 = p(2) - 2p(1) + p(0)$.

This pattern continues, so that the sequence $a_0, a_1, a_2, \ldots$ is the first
column of our difference array. In order to establish this, we introduce a
little notation. Define Δp(n) = p(n + 1) − p(n).


We call Δ the difference operator. We define $\Delta^2 p(n) = \Delta(\Delta p(n))$, $\Delta^3 p(n) =
\Delta(\Delta^2 p(n))$, and so on. We have
$$\Delta p(n) = p(n+1) - p(n)$$
$$\Delta^2 p(n) = p(n+2) - 2p(n+1) + p(n)$$
$$\Delta^3 p(n) = p(n+3) - 3p(n+2) + 3p(n+1) - p(n)$$
$$\vdots$$
$$\Delta^k p(n) = \sum_{j=0}^{k} (-1)^{j+k}\binom{k}{j} p(n+j).$$
The array of differences looks like

p(0),    p(1),    p(2),    p(3),    p(4),    p(5),    ...
Δp(0),   Δp(1),   Δp(2),   Δp(3),   Δp(4),   Δp(5),   ...
Δ²p(0),  Δ²p(1),  Δ²p(2),  Δ²p(3),  Δ²p(4),  Δ²p(5),  ...
Δ³p(0),  Δ³p(1),  Δ³p(2),  Δ³p(3),  Δ³p(4),  Δ³p(5),  ...
Δ⁴p(0),  Δ⁴p(1),  Δ⁴p(2),  Δ⁴p(3),  Δ⁴p(4),  Δ⁴p(5),  ...


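So our claim is that
$$p(n) = \sum_{k=0}^{n} \binom{n}{k} \sum_{i=0}^{k} (-1)^{i+k}\binom{k}{i} p(i).$$
For a fixed i, the coefficient of p(i) is
$$\sum_{k=i}^{n} (-1)^{i+k}\binom{n}{k}\binom{k}{i}.$$
From the subcommittee identity, we obtain
$$\binom{n}{i}\sum_{k=i}^{n} (-1)^{i+k}\binom{n-i}{k-i} = \begin{cases} 1 & \text{for } n = i \\ 0 & \text{otherwise.} \end{cases}$$
This completes the argument.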


Theorem. For any n and k, we have
$$\sum_{j=k}^{n} (-1)^{j+k}\binom{n}{j}\binom{j}{k} = \delta(n, k),$$
where δ(n, k) = 1 if n = k and δ(n, k) = 0 if n ≠ k.


EXERCISES 


1.56 (a) Find a formula for the number of surjective (onto) functions,
T(m, n), from {1, 2, 3,...,m} to {1, 2, 3,...,n}, where m ≥ n.

(b) Find a formula for the Stirling number of the second kind ${n \brace k}$.
1.57 Euler’s φ-function is defined as follows:

$\phi(n) = |\{1 \le x \le n : \gcd(x, n) = 1\}|$.
Find a formula for φ(n) in terms of the prime factorization of n.
1.58 Prove that
$$a_n = \sum_{k=0}^{n} (-1)^k \binom{n}{k} b_k$$
if and only if
$$b_n = \sum_{k=0}^{n} (-1)^k \binom{n}{k} a_k.$$
1.59 (Möbius inversion formula) (a) Prove that if
$$\sum_{k} \alpha(n, k)\beta(k, j) = \sum_{k} \beta(n, k)\alpha(k, j) = \delta(n, j),$$
then $f(n) = \sum_{k} \alpha(n, k) g(k)$ if and only if $g(n) = \sum_{k} \beta(n, k) f(k)$.

(b) Let α(n, k) = 1 if k | n and 0 otherwise. Determine β(n, k).
1.60 What polynomial produces the sequence 

L438, 246778, 156... 7 
1.61 We say that two sets A and B are linked if A ∩ B ≠ ∅ and neither A
nor B is a subset of the other. If S is an n-element set, how many pairs 
(A, B) of subsets of S exist with A and B linked? 


1.7 Fibonacci numbers 
Let’s discuss one of the most famous sequences of numbers, the Fibonacci 
sequence. The Fibonacci sequence $\{F_0, F_1, F_2, \ldots\}$ is defined recursively by
the initial values

(1.22) $F_0 = 0$, $F_1 = 1$

and the recurrence relation

(1.23) $F_n = F_{n-1} + F_{n-2}$ for n ≥ 2.
Thus, the Fibonacci numbers are

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, ....

Fibonacci numbers count many things. For example: 


• $F_{n+1}$ is the number of ways that an n × 1 box may be packed with 2 ×
1 and 1 × 1 boxes.
• $F_{n+2}$ is the number of binary strings of length n that do not contain the
substring 00.
• $F_{n+2}$ is the number of subsets of $N_n$ that contain no two consecutive
integers.
• $F_{n-1}$ is the number of compositions of n that do not contain 1’s (see
Exercise 1.54).
Let’s prove the second of these formulas. 
Let $s_n$ be the number of binary strings of length n that contain no 00. We
will prove that $s_n = F_{n+2}$ for n ≥ 1. Observe that $s_1 = 2 = F_3$ and $s_2 = 3 =
F_4$. We will show that $s_n = s_{n-1} + s_{n-2}$ for n ≥ 3 (the same recurrence
relation satisfied by the Fibonacci numbers). Notice that each binary string
of length n that does not contain 00 ends in either 1 or 10. The number of
such strings of the first type is $s_{n-1}$ and the number of such strings of the
second type is $s_{n-2}$. Hence $s_n = s_{n-1} + s_{n-2}$ for n ≥ 3. Now, since $\{s_n\}$
satisfies the same recurrence relation as the Fibonacci numbers, and $s_1 = F_3$
and $s_2 = F_4$, it follows by mathematical induction that $s_n = F_{n+2}$ for all n ≥
1.
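
The count of binary strings avoiding 00 can be confirmed by enumeration for small n. This is a brief sketch with hypothetical function names; `fibonacci` uses the convention F_0 = 0, F_1 = 1 from (1.22).

```python
from itertools import product

def fibonacci(n: int) -> int:
    """Return F_n with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def strings_without_00(n: int) -> int:
    """Count binary strings of length n that do not contain the substring 00."""
    return sum(1 for bits in product("01", repeat=n)
               if "00" not in "".join(bits))

assert all(strings_without_00(n) == fibonacci(n + 2) for n in range(1, 13))
```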


EXAMPLE 1.30 Cassini’s identity 
Prove that $F_n^2 - F_{n-1}F_{n+1} = (-1)^{n+1}$ for n ≥ 1.
Solution: We will prove the result by mathematical induction. The identity
holds for n = 1, since $F_1^2 - F_0F_2 = 1 - 0 = 1 = (-1)^2$. Assume that it
holds for n. Then
$$F_{n+1}^2 - F_nF_{n+2} = F_{n+1}^2 - F_n(F_n + F_{n+1})$$
$$= F_{n+1}(F_{n+1} - F_n) - F_n^2$$
$$= F_{n+1}F_{n-1} - F_n^2$$
$$= (-1)^{n+2}.$$

Hence, the formula holds for n + 1 and by induction for all n ≥ 1.
Here is a delight from Pascal’s triangle. 


Singmaster’s theorem (1975). There are infinitely many numbers that occur at least six times 
in Pascal’s triangle. 


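Proof. Suppose that we have a solution to
$$r = \binom{n}{m-1} = \binom{n-1}{m}$$
given by
$$m = F_{2k-1}F_{2k}, \quad n = F_{2k}F_{2k+1}, \quad k \ge 2.$$

The number r in such a solution occurs (at least) six times in Pascal’s
triangle:
$$r = \binom{r}{1} = \binom{r}{r-1} = \binom{n}{m-1} = \binom{n}{n-m+1} = \binom{n-1}{m} = \binom{n-1}{n-m-1}.$$

The following relations are equivalent:
$$\binom{n}{m-1} = \binom{n-1}{m}$$
$$\frac{n!}{(m-1)!\,(n-m+1)!} = \frac{(n-1)!}{m!\,(n-m-1)!}$$
$$nm = (n-m+1)(n-m)$$
$$F_{2k-1}F_{2k}F_{2k}F_{2k+1} = (F_{2k}F_{2k+1} - F_{2k-1}F_{2k} + 1)(F_{2k}F_{2k+1} - F_{2k-1}F_{2k})$$
$$= [F_{2k}(F_{2k+1} - F_{2k-1}) + 1][F_{2k}(F_{2k+1} - F_{2k-1})]$$
$$= (F_{2k}^2 + 1)F_{2k}^2$$
$$F_{2k-1}F_{2k+1} = F_{2k}^2 + 1.$$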

The last relation is true by Cassini’s identity. 
The smallest such number given by our proof (when k = 2) is 3003. 
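
The construction in the proof can be tried out directly. The sketch below computes the pair n = F_{2k}F_{2k+1}, m = F_{2k-1}F_{2k} and checks that the two binomial coefficients agree; `singmaster_pair` is a hypothetical name.

```python
from math import comb

def fibonacci(n: int) -> int:
    """Return F_n with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def singmaster_pair(k: int):
    """Return (n, m) with n = F_{2k} F_{2k+1} and m = F_{2k-1} F_{2k}."""
    m = fibonacci(2 * k - 1) * fibonacci(2 * k)
    n = fibonacci(2 * k) * fibonacci(2 * k + 1)
    return n, m

for k in range(2, 6):
    n, m = singmaster_pair(k)
    assert comb(n, m - 1) == comb(n - 1, m)
```

For k = 2 the pair is (15, 6), and C(15, 5) = C(14, 6) = 3003.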


EXERCISES 
1.62 Prove the identity 
$F_{n+1}^2 + F_n^2 = F_{2n+1}$, n ≥ 0.
1.63 Prove the identity 
$F_1^2 + \cdots + F_n^2 = F_nF_{n+1}$, n ≥ 1.
1.64 Prove the identity 
Kaif a eins 1 ee: 
1.65 Where do you find Fibonacci numbers in Pascal’s triangle? What 
identity proves this? 
1.66 Find positive integers n and k, with k <n, for which 


$$\binom{n}{k} + \binom{n}{k+1} = \binom{n}{k+2}.$$


1.67 Prove that 


$$\sum_{n=1}^{\infty} \tan^{-1}\frac{1}{F_{2n+1}} = \frac{\pi}{4}.$$

1.68 Find the least number greater than 1 that occurs six times in 


Pascal’s triangle. 


1.8 Linear recurrence relations 


A sequence $\{a_n\}$ satisfies a linear homogeneous recurrence relation with
constant coefficients (of order k) if

(1.24) $$a_n = \sum_{i=1}^{k} c_i a_{n-i}$$
for constants $c_1,\ldots,c_k$ and all n ≥ k.
The Fibonacci sequence $\{F_n\}$ satisfies a linear homogeneous recurrence
relation with constant coefficients (of order 2).
How fast do the Fibonacci numbers grow? One might guess that they grow
exponentially and they do. In order to find the exact rate of growth, we first
find an explicit formula for $F_n$.


We will show how to guess and construct a solution. Assume that $x^n$,
where n ≥ 0, is the general term of a sequence that satisfies the Fibonacci
recurrence relation (but not necessarily the same initial conditions). Then
$$x^n = x^{n-1} + x^{n-2}.$$

Assuming that x ≠ 0, we divide through by $x^{n-2}$ and obtain the equation

(1.25) $$x^2 - x - 1 = 0.$$


This polynomial $x^2 - x - 1$ is called the characteristic polynomial of the
sequence. We use the quadratic formula to find the two roots of the
characteristic polynomial:

(1.26) $$\phi = \frac{1+\sqrt{5}}{2}, \qquad \hat\phi = \frac{1-\sqrt{5}}{2}.$$

We call $\phi$ the “golden ratio.” Note that $\phi \approx 1.6$ and $\hat\phi \approx -0.6$.
So we know that $\phi^n$ and $\hat\phi^n$ both satisfy the Fibonacci recurrence relation.
Any linear combination $A\phi^n + B\hat\phi^n$, with A, B ∈ R, also satisfies the
recurrence relation. For
$$(A\phi^{n-1} + B\hat\phi^{n-1}) + (A\phi^{n-2} + B\hat\phi^{n-2}) = A(\phi^{n-1} + \phi^{n-2}) + B(\hat\phi^{n-1} + \hat\phi^{n-2})$$
$$= A\phi^n + B\hat\phi^n.$$

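We use the initial conditions to solve for the coefficients A and B.
Recalling that $F_0 = 0$ and $F_1 = 1$, we obtain two linear equations to solve
simultaneously:
$$0 = A\phi^0 + B\hat\phi^0 = A + B$$
$$1 = A\phi^1 + B\hat\phi^1 = A\left(\frac{1+\sqrt{5}}{2}\right) + B\left(\frac{1-\sqrt{5}}{2}\right).$$
We find that
$$A = \frac{1}{\sqrt{5}} \quad \text{and} \quad B = -\frac{1}{\sqrt{5}}.$$

Thus, a formula for the Fibonacci numbers is

(1.27) $$F_n = \frac{1}{\sqrt{5}}\left(\phi^n - \hat\phi^n\right), \quad n \ge 0.$$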
The above function satisfies the recurrence relation and initial conditions 
of the Fibonacci sequence, and hence is a formula for the Fibonacci 
sequence (since the sequence is well-defined). But the derivation of the 
formula was based on the assumption that some basic solutions to the 
recurrence relation were exponential. How did we know this in advance and 
would it be true for other linear recurrence relations? A more direct way to 
solve these problems is via generating functions, which we address in 

Chapter 2. 

Now, how do we evaluate the growth rate of $F_n$? We say that a positive-valued
function f(n) is asymptotic to another such function g(n), and we
write f(n) ~ g(n), if $\lim_{n\to\infty} f(n)/g(n) = 1$. Since $\hat\phi^n \to 0$ as n → ∞, we
conclude that

(1.28) $$F_n \sim \frac{\phi^n}{\sqrt{5}}.$$
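
Both the explicit formula and the asymptotic estimate are easy to observe numerically. The sketch below uses floating-point arithmetic, so it is only reliable for moderate n; the names `PHI`, `PHI_HAT`, and `binet` are hypothetical.

```python
from math import sqrt

PHI = (1 + sqrt(5)) / 2      # the golden ratio
PHI_HAT = (1 - sqrt(5)) / 2  # the other root of x^2 - x - 1

def fibonacci(n: int) -> int:
    """Return F_n with F_0 = 0 and F_1 = 1, computed from the recurrence."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def binet(n: int) -> float:
    """Floating-point value of (PHI^n - PHI_HAT^n) / sqrt(5)."""
    return (PHI ** n - PHI_HAT ** n) / sqrt(5)

# The explicit formula matches the recurrence after rounding.
assert all(round(binet(n)) == fibonacci(n) for n in range(41))
# The ratio F_n / (PHI^n / sqrt(5)) is already very close to 1 at n = 30.
assert abs(fibonacci(30) / (PHI ** 30 / sqrt(5)) - 1) < 1e-9
```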


EXAMPLE 1.31 
Find an explicit formula for the sequence $\{a_n\}$ defined by the recurrence
formula $a_0 = 1$, $a_1 = 1$, $a_n = 6a_{n-1} - 9a_{n-2}$, n ≥ 2.
Solution: The characteristic polynomial of the sequence is
$$x^2 - 6x + 9 = (x - 3)^2,$$
which has 3 as a double root. Hence, $3^n$ is a solution to the recurrence
relation. However, we need a second solution in order to make the formula
satisfy the initial conditions. A guess for a second solution is $n3^n$. Let’s
verify that this solution satisfies the recurrence relation:
$$6(n-1)3^{n-1} - 9(n-2)3^{n-2} = 3^{n-2}(18n - 18 - 9n + 18) = n3^n.$$

Any linear combination of our two solutions also satisfies the recurrence
relation:

$A3^n + Bn3^n$.
In order to satisfy the initial values, $a_0 = 1$ and $a_1 = 1$, we require that
1 = A
1 = 3A + 3B,

and hence A = 1 and B = −2/3. Therefore, an explicit formula for the
sequence is

$a_n = 3^n - 2n3^{n-1}$, n ≥ 0.
The next example illustrates the technique of adding a particular solution 
and a homogeneous solution. 


EXAMPLE 1.32 


Find an explicit formula for the sequence $\{a_n\}$ defined by the recurrence
formula $a_0 = 1$, $a_1 = 1$, $a_n = 6a_{n-1} - 9a_{n-2} + n$, n ≥ 2.
Solution: We find a particular solution to the recurrence relation. Assume
the existence of a solution of the form $a_n = \alpha n + \beta$, where α and β are
constants. Thus
$$\alpha n + \beta = 6(\alpha(n-1) + \beta) - 9(\alpha(n-2) + \beta) + n$$
$$(4\alpha - 1)n = 12\alpha - 4\beta.$$
In order for this identity to hold for all n, we must have α = 1/4 and hence β
= 3/4.
Therefore
$$a_n = \frac{1}{4}n + \frac{3}{4}$$
satisfies the recurrence relation.

We solved the homogeneous version of this recurrence relation in the
previous example. Thus, the general solution to the recurrence relation is of
the form
$$a_n = A3^n + Bn3^n + \frac{1}{4}n + \frac{3}{4}.$$

The initial values, $a_0 = 1$ and $a_1 = 1$, determine the values A = 1/4 and B =
−1/4.
Therefore, an explicit formula is
$$a_n = \frac{1}{4}3^n - \frac{1}{4}n3^n + \frac{1}{4}n + \frac{3}{4}, \quad n \ge 0.$$
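
A closed form obtained this way is worth testing against the recurrence itself. The sketch below compares the formula (3^n − n·3^n + n + 3)/4 with terms generated from a_n = 6a_{n-1} − 9a_{n-2} + n; the function names are hypothetical.

```python
from fractions import Fraction

def by_recurrence(count: int):
    """First `count` terms of a_0 = 1, a_1 = 1, a_n = 6a_{n-1} - 9a_{n-2} + n."""
    a = [1, 1]
    for n in range(2, count):
        a.append(6 * a[n - 1] - 9 * a[n - 2] + n)
    return a[:count]

def closed_form(n: int) -> Fraction:
    """Homogeneous part (3^n - n * 3^n)/4 plus particular part (n + 3)/4."""
    return Fraction(3 ** n - n * 3 ** n, 4) + Fraction(n + 3, 4)

assert by_recurrence(12) == [closed_form(n) for n in range(12)]
```

The first few terms are 1, 1, −1, −12, which already exercises both the homogeneous and the particular parts.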
The Lucas numbers are defined as 
(1.29) $L_0 = 2$, $L_1 = 1$, $L_n = L_{n-1} + L_{n-2}$, n ≥ 2.
Thus, the Lucas numbers are

2, 1, 3, 4, 7, 11, 18, 29, 47, 76, 123, 199, 322, 521, 843, 1364, ....

• $L_n$ is the number of ways that an n × 1 box may be packed with 2 × 1
and 1 × 1 boxes, allowing “wrap-around.”
• $L_n$ is the number of subsets of {1,...,n} which do not contain two
consecutive numbers, where 1 and n are considered consecutive.
Since the Lucas numbers satisfy the same recurrence relation as the
Fibonacci numbers, they have the same characteristic polynomial, $x^2 - x -
1$. Taking into account the initial values $L_0 = 2$ and $L_1 = 1$, we obtain a
formula for the Lucas numbers:

(1.30) $L_n = \phi^n + \hat\phi^n$, n ≥ 0.
The simplicity of this formula is one of the nice properties of the Lucas
sequence. A consequence is that the Lucas numbers are given by the elegant
formula $L_n = \{\phi^n\}$ for n ≥ 2, where {x} is the nearest integer to x.


EXAMPLE 1.33 Squares of Fibonacci numbers 


Let $\{F_n^2\}$ be the sequence of squares of the Fibonacci numbers. Find a
linear recurrence relation with constant coefficients for this sequence.
Solution: Start with the relations
$$F_n = F_{n-1} + F_{n-2}$$
$$F_{n-3} = F_{n-1} - F_{n-2}.$$
Square both relations and add:
$$F_n^2 + F_{n-3}^2 = (F_{n-1} + F_{n-2})^2 + (F_{n-1} - F_{n-2})^2$$
$$= 2F_{n-1}^2 + 2F_{n-2}^2.$$
We obtain the recurrence relation
$$F_n^2 = 2F_{n-1}^2 + 2F_{n-2}^2 - F_{n-3}^2, \quad n \ge 3.$$


EXAMPLE 1.34 Powers of Fibonacci numbers 
Find a linear recurrence relation with constant coefficients for the sequence $\{F_n^k\}$ of kth powers of the Fibonacci numbers, where k is a positive integer.
Solution: The method is to use characteristic polynomials. Let’s work out the k = 2 case first (this will reproduce the recurrence relation found in the previous example). We know a direct formula for the Fibonacci numbers:
$$F_n = A\phi^n + B\hat{\phi}^n, \quad n \ge 0,$$
where A and B are constants (we know the constants but don’t need them). It follows that
$$F_n^2 = A^2(\phi^2)^n + 2AB(\phi\hat{\phi})^n + B^2(\hat{\phi}^2)^n,$$
and since $\phi\hat{\phi} = -1$, the roots of the characteristic polynomial for the sequence $\{F_n^2\}$ are $\phi^2$, $\hat{\phi}^2$, and -1. Hence, the characteristic polynomial for this sequence is
$$(x - \phi^2)(x - \hat{\phi}^2)(x + 1) = [x^2 - (\phi^2 + \hat{\phi}^2)x + 1](x + 1).$$
To simplify further, recall that the Lucas numbers $L_n$ are given by the formula $L_n = \phi^n + \hat{\phi}^n$, $n \ge 0$.

Using this formula, the characteristic polynomial in the case k = 2 simplifies to
$$(x^2 - L_2x + 1)(x + 1) = (x^2 - 3x + 1)(x + 1) = x^3 - 2x^2 - 2x + 1.$$
This confirms the recurrence relation found in the previous example:
$$F_n^2 = 2F_{n-1}^2 + 2F_{n-2}^2 - F_{n-3}^2, \quad n \ge 3.$$

The case k = 3 is similar. By the binomial theorem, the formula for $F_n^3$ contains powers of $\phi^3$, $\phi^2\hat{\phi} = -\phi$, $\phi\hat{\phi}^2 = -\hat{\phi}$, and $\hat{\phi}^3$. Therefore, the characteristic polynomial of the sequence $\{F_n^3\}$ is
$$\begin{aligned}
(x - \phi^3)(x - \hat{\phi}^3)(x + \phi)(x + \hat{\phi}) &= [x^2 - (\phi^3 + \hat{\phi}^3)x - 1][x^2 + (\phi + \hat{\phi})x - 1] \\
&= (x^2 - L_3x - 1)(x^2 + L_1x - 1) \\
&= (x^2 - 4x - 1)(x^2 + x - 1) \\
&= x^4 - 3x^3 - 6x^2 + 3x + 1.
\end{aligned}$$
This gives us a recurrence relation for the cubes of the Fibonacci numbers:
$$F_n^3 = 3F_{n-1}^3 + 6F_{n-2}^3 - 3F_{n-3}^3 - F_{n-4}^3, \quad n \ge 4.$$

For $k \ge 1$, the characteristic polynomial for the sequence of kth powers of the Fibonacci numbers is
$$\prod_{i=0}^{\lfloor (k-1)/2 \rfloor}\left[x^2 + (-1)^{i+1}L_{k-2i}\,x + (-1)^k\right] \cdot
\begin{cases}
1 & \text{if } k \bmod 4 = 1, 3 \\
(x - 1) & \text{if } k \bmod 4 = 0 \\
(x + 1) & \text{if } k \bmod 4 = 2.
\end{cases}$$

This formula, found by John Riordan, means that the sequence $\{F_n^k\}$ of kth powers of the Fibonacci numbers satisfies a linear recurrence relation of order k + 1 with integer coefficients.
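
Riordan's product can be checked mechanically. The sketch below is an illustration, not part of the original solution: it multiplies out the quadratic factors $x^2 + (-1)^{i+1}L_{k-2i}x + (-1)^k$ for $0 \le i \le \lfloor (k-1)/2 \rfloor$, appends the extra linear factor when k is even, and then tests that the kth powers of the Fibonacci numbers satisfy the resulting recurrence. The helper names (`poly_mul`, `power_char_poly`, `satisfies_recurrence`) are hypothetical.

```python
def lucas(n):
    """L_n from L_0 = 2, L_1 = 1 and the Fibonacci recurrence."""
    a, b = 2, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fibonacci_list(count):
    """F_0, F_1, ..., F_{count-1}."""
    values = [0, 1]
    while len(values) < count:
        values.append(values[-1] + values[-2])
    return values[:count]


def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, highest degree first."""
    result = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            result[i + j] += pi * qj
    return result


def power_char_poly(k):
    """Characteristic polynomial for {F_n^k}, built from the product formula."""
    poly = [1]
    for i in range((k - 1) // 2 + 1):
        # Factor x^2 + (-1)^(i+1) L_{k-2i} x + (-1)^k.
        poly = poly_mul(poly, [1, (-1) ** (i + 1) * lucas(k - 2 * i), (-1) ** k])
    if k % 4 == 0:
        poly = poly_mul(poly, [1, -1])  # extra factor (x - 1)
    elif k % 4 == 2:
        poly = poly_mul(poly, [1, 1])   # extra factor (x + 1)
    return poly


def satisfies_recurrence(k, terms=40):
    """Check that sum_j poly[j] * F_{n-j}^k = 0 for all n from the order onward."""
    poly = power_char_poly(k)
    powers = [f ** k for f in fibonacci_list(terms)]
    order = len(poly) - 1
    return all(
        sum(c * powers[n - j] for j, c in enumerate(poly)) == 0
        for n in range(order, terms)
    )


print(power_char_poly(2))
print(power_char_poly(3))
print(all(satisfies_recurrence(k) for k in range(1, 9)))
```

The coefficient lists for k = 2 and k = 3 can be compared directly with the two polynomials worked out by hand above.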


EXERCISES 
1.69 Let $\{a_n\}$ be defined by the recurrence


qq 0) 0 =, Ge: Bien Gi 3, 8d: 
Find an explicit formula for $a_n$.

1.70 Suppose that the sequence $\{a_n\}$ satisfies the recurrence relation
$$a_n = 3a_{n-1} + 4a_{n-2} - 12a_{n-3}, \quad n \ge 3,$$
where $a_0 = 0$, $a_1 = 1$, and $a_2 = 2$. Find an explicit formula for $a_n$.
1.71 Let $\{b_n\}$ be defined by the recurrence
$$b_0 = 0, \quad b_1 = 0, \quad b_2 = 1, \quad b_n = 4b_{n-1} - b_{n-2} - 6b_{n-3}, \quad n \ge 3.$$
Find an explicit formula for $b_n$.
Do the same where the initial values are $b_0 = 0$, $b_1 = 1$, $b_2 = 2$.
1.72 Define $\{a_n\}$ by the recurrence
$$a_0 = 0, \quad a_1 = 1, \quad a_n = 5a_{n-1} - 6a_{n-2}, \quad n \ge 2$$
and $\{b_n\}$ by the recurrence
$$b_0 = 0, \quad b_1 = 1, \quad b_n = 9b_{n-1} - 20b_{n-2}, \quad n \ge 2.$$
Find a linear recurrence for the sequence $\{c_n\}$ defined by
$$c_n = a_n + b_n, \quad n \ge 0.$$
Find a linear recurrence for the sequence $\{d_n\}$ defined by
$$d_n = a_nb_n, \quad n \ge 0.$$
1.73 (a) Find a recurrence formula for the sequence $\{a_n\}$ defined by $a_n = 3^n + n^2$, where $n \ge 0$.
(b) Find a recurrence formula for the sequence $\{a_n\}$ defined by $a_n = 3^n + n^2 + 6n + 7$, where $n \ge 0$.
1.74 Find an explicit formula for the sequence $\{a_n\}$ defined by the recurrence relation $a_0 = 0$, $a_1 = 1$, $a_n = a_{n-1} + a_{n-2} + 7$, $n \ge 2$.
1.75 Find an explicit formula for the sequence $\{a_n\}$ defined by the recurrence relation $a_0 = 1$, $a_1 = 3$, $a_n = a_{n-1} + a_{n-2} + 2^{n-2}$, $n \ge 2$. How fast does $a_n$ grow?
1.76 Prove the identity $L_n = F_{n-1} + F_{n+1}$ for $n \ge 1$.
1.77 Prove the identity $F_n = (L_{n-1} + L_{n+1})/5$ for $n \ge 1$.
1.78 Prove the identity $F_{2n} = F_nL_n$ for $n \ge 0$.


1.79 Prove the identity $L_{3n} = L_n^3 - 3(-1)^nL_n$ for $n \ge 0$.


1.80 Find a linear recurrence relation satisfied by all cubic polynomials. 
1.81 Find a linear homogeneous recurrence relation (not with constant coefficients) for the sequence $\{a_n\}$, where $a_n = 2^n + n!$.

1.82 A square number is a number of the form $n^2$, where n is a nonnegative integer. A triangular number is a number of the form n(n + 1)/2, where n is a nonnegative integer. Let $a_n$ be the nth number that is both square and triangular. For example, $a_0 = 0$, $a_1 = 1$, and $a_2 = 36$. Find a linear homogeneous recurrence relation with constant coefficients for $\{a_n\}$.

1.83 Suppose that $a_n = c_0 + \sum_{i=1}^{k} c_ia_{n-i}$, for $n \ge k$. Prove that $\{a_n\}$ satisfies a linear homogeneous recurrence relation (with constant coefficients) of order k + 1.


1.9 Special recurrence relations 


Recall that a derangement is a permutation with no fixed points. Let $d_n$ denote the number of derangements of the set {1, 2, 3,...,n}. We know that $d_1 = 0$ and $d_2 = 1$. We claim that $\{d_n\}$ satisfies the linear recurrence relation
(1.31) $$d_n = (n-1)(d_{n-1} + d_{n-2}), \quad n \ge 3.$$

In a derangement of {1, 2, 3,...,n}, the element n must occur in a cycle 
of length 2 or a cycle of greater length. There are n — 1 choices for the 
other element in a cycle of length 2, and the remaining elements 
constitute a derangement of n — 2 elements. In a cycle of length greater 
than 2, there are n — 1 choices for the element which maps to n, and the 
elements other than n constitute a derangement of n —1 elements. 
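
As a sanity check on recurrence (1.31), the following Python sketch computes derangement numbers from the recurrence and compares them with a direct count over all permutations for small n. The function names are ours, and the starting value $d_0 = 1$ is a convention assumed here so that the list can be indexed from 0.

```python
from itertools import permutations


def derangements(n_max):
    """Return [d_0, ..., d_{n_max}] using d_n = (n - 1)(d_{n-1} + d_{n-2})."""
    d = [1, 0, 1]  # d_0 = 1 (assumed convention), d_1 = 0, d_2 = 1
    for n in range(3, n_max + 1):
        d.append((n - 1) * (d[n - 1] + d[n - 2]))
    return d[: n_max + 1]


def count_by_brute_force(n):
    """Count permutations of {0, ..., n-1} with no fixed point."""
    return sum(
        all(image != i for i, image in enumerate(p))
        for p in permutations(range(n))
    )


d = derangements(7)
print(d)
print(all(d[n] == count_by_brute_force(n) for n in range(1, 8)))
```

The brute-force count grows like n!, so it is only practical as a cross-check for small n; the recurrence itself handles large n without difficulty.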

We defined the Stirling number of the second kind ${n \brace k}$, for $1 \le k \le n$, to be the number of partitions of the set {1, 2, 3,...,n} into k parts. We will now find a recurrence formula for these numbers. Note that ${n \brace 1} = 1$ for all n, as there is only one way to partition {1, 2, 3,...,n} into one subset. Also, ${n \brace n} = 1$ for all n, as {1, 2, 3,...,n} may be partitioned into n subsets in only one way. Now let us find a way to compute ${n \brace k}$ from previous values. In a partition of {1, 2, 3,...,n} into k parts, the element n can be alone in a part of the partition or it can be in a part with other elements. If it is alone, then there are ${n-1 \brace k-1}$ ways to partition {1, 2,...,n - 1} into the other k - 1 parts. However, if n is in a part with other elements, then there are k choices for which part contains n and ${n-1 \brace k}$ ways to partition {1, 2,...,n - 1} into k parts. Therefore
(1.32) $${n \brace k} = {n-1 \brace k-1} + k{n-1 \brace k} \quad \text{for } 2 \le k < n.$$
From the recurrence formula, we obtain a table (Table 1.1) of values of ${n \brace k}$ for small n and k. The row sums of this table are the Bell numbers.
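
The table is straightforward to regenerate from recurrence (1.32). The sketch below is one possible way to do it; it assumes the conventions that the number is 0 when k = 0 or k > n, except that the entry for n = k = 0 equals 1, so that the boundary values come out of the same recurrence. The function name `stirling_second` is ours.

```python
def stirling_second(n_max):
    """Table S with S[n][k] = number of partitions of an n-set into k parts."""
    S = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    S[0][0] = 1  # assumed convention; all other out-of-range entries stay 0
    for n in range(1, n_max + 1):
        for k in range(1, n + 1):
            # n is alone in its part, or joins one of the k existing parts.
            S[n][k] = S[n - 1][k - 1] + k * S[n - 1][k]
    return S


S = stirling_second(6)
for n in range(1, 7):
    row = S[n][1 : n + 1]
    print(n, row, "Bell number:", sum(row))
```

Each printed row lists the values for k = 1,...,n, followed by the row sum, which is the Bell number B(n).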


Table 1.1 Stirling numbers of the second kind ${n \brace k}$ and Bell numbers B(n).

[Table entries not recoverable from the source.]


Let us verify an entry of the table, say ${4 \brace 3} = 6$. There are six ways to partition the set {1, 2, 3, 4} into three subsets: {12, 3, 4}, {1, 3, 24}, {1, 2, 34}, {13, 2, 4}, {1, 4, 23}, {14, 2, 3} (suppressing commas and one level of set notation).

The Stirling number of the first kind ${n \brack k}$, for $1 \le k \le n$, is defined to be the number of permutations of {1, 2, 3,...,n} that have k cycles. For example, ${3 \brack 2} = 3$, as there are three permutations of {1, 2, 3} with two cycles: (1 2)(3), (1 3)(2), and (2 3)(1).

Observe that ${n \brack n} = 1$ (there is only one identity permutation) and ${n \brack 1} = (n-1)!$ (there are (n - 1)! ways to seat n guests at a circular table). In a permutation of {1,...,n}, the element n can constitute a cycle by itself or it can follow one of the other n - 1 elements in one of k cycles. In the first case, there are ${n-1 \brack k-1}$ choices for dividing the other n - 1 elements into k - 1 cycles. In the second case, there are n - 1 choices for which element n follows and ${n-1 \brack k}$ ways to divide n - 1 elements into k cycles. Therefore
(1.33)
$${n \brack 1} = (n-1)! \quad \text{for } n \ge 1$$
$${n \brack n} = 1 \quad \text{for } n \ge 2$$
$${n \brack k} = {n-1 \brack k-1} + (n-1){n-1 \brack k} \quad \text{for } 2 \le k < n.$$

From this recurrence formula, we obtain a table (Table 1.2) of the values of ${n \brack k}$ for small n and k. Note that the sum of the entries of the nth row of the table is n!, which is correct because each permutation of {1, 2, 3,...,n} is counted.
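
Recurrence (1.33) translates directly into a short program. The following sketch is an illustration under the same assumed boundary conventions as before (entries are 0 when k = 0 or k > n, and the entry for n = k = 0 is 1); it also confirms the remark that each row sums to n!. The name `stirling_first` is ours.

```python
from math import factorial


def stirling_first(n_max):
    """Table c with c[n][k] = number of permutations of an n-set with k cycles."""
    c = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    c[0][0] = 1  # assumed convention; other out-of-range entries stay 0
    for n in range(1, n_max + 1):
        for k in range(1, n + 1):
            # n forms its own cycle, or follows one of the other n - 1 elements.
            c[n][k] = c[n - 1][k - 1] + (n - 1) * c[n - 1][k]
    return c


c = stirling_first(7)
for n in range(1, 8):
    print(n, c[n][1 : n + 1])
print(all(sum(c[n]) == factorial(n) for n in range(1, 8)))
```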


Table 1.2 Stirling numbers of the first kind ${n \brack k}$.

n \ k     1      2      3      4      5      6      7
1         1
2         1      1
3         2      3      1
4         6     11      6      1
5        24     50     35     10      1
6       120    274    225     85     15      1
7       720   1764   1624    735    175     21      1

We set ${n \brace k} = {n \brack k} = 0$ for k > n or k = 0, and ${0 \brace 0} = {0 \brack 0} = 1$. Stirling numbers of the first and second kinds are linked by a simple identity:

(1.34) $${n \brace k} = {-k \brack -n} \quad \text{for all } k, n.$$


This means that the Stirling numbers are represented in dovetailing 
arrays (Table 1.3). 


Table 1.3 Stirling numbers of the first and second kinds.

[Table entries not recoverable from the source.]


Recall that p(n) is the number of partitions of n units into an arbitrary number of parts, while p(n, k) is the number of partitions of n units into k parts. Clearly, $p(n) = \sum_{k=1}^{n} p(n, k)$.

A recurrence formula calculates p(n, k):

(1.35)
$$p(1, 1) = 1$$
$$p(n, k) = 0, \quad k > n \text{ or } k = 0,$$
$$p(n, k) = p(n-1, k-1) + p(n-k, k), \quad n \ge 2 \text{ and } 1 \le k \le n.$$

The value p(1, 1) = 1 is obvious. Since there are no partitions of n into more than n parts or into 0 parts, we have p(n, k) = 0 for k > n or k = 0. In a partition of n into k parts, the smallest part is either 1 or greater than 1. In the former case, there are p(n - 1, k - 1) partitions of the remaining number n - 1 into k - 1 parts. In the latter case, the partitions of n into k parts are equinumerous with the partitions of n - k into k parts (just subtract 1 from each part in the partition of n). This proves the formula p(n, k) = p(n - 1, k - 1) + p(n - k, k) for $n \ge 2$ and $1 \le k \le n$.
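
Recurrence (1.35) is well suited to a table-filling computation. The sketch below is a minimal illustration (the function name `partition_table` is ours); it fills in p(n, k) row by row and obtains p(n) as a row sum.

```python
def partition_table(n_max):
    """Table P with P[n][k] = p(n, k), for 0 <= n, k <= n_max (n_max >= 1)."""
    P = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    P[1][1] = 1
    for n in range(2, n_max + 1):
        for k in range(1, n + 1):
            # Smallest part equal to 1, or every part greater than 1.
            # P[n - k][k] is 0 whenever k > n - k, as the recurrence requires.
            P[n][k] = P[n - 1][k - 1] + P[n - k][k]
    return P


P = partition_table(100)
print(P[5][1:6])       # p(5, 1), ..., p(5, 5)
print(sum(P[5]))       # p(5)
print(sum(P[100]))     # p(100)
```

The value printed for p(100) can be compared with the last entry of Table 1.5.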

Tables 1.4 and 1.5 show values of p(n, k) and p(n) for small n and k. 


Table 1.4 Partition numbers p(n, k).

[Table entries not recoverable from the source.]


Table 1.5 Partition numbers p(n).

  n       p(n)      n       p(n)      n        p(n)
 27       3010     52     281589     77    10619863
 28       3718     53     329931     78    12132164
 29       4565     54     386155     79    13848650
 30       5604     55     451276     80    15796476
 31       6842     56     526823     81    18004327
 32       8349     57     614154     82    20506255
 33      10143     58     715220     83    23338469
 34      12310     59     831820     84    26543660
 35      14883     60     966467     85    30167357
 36      17977     61    1121505     86    34262962
 37      21637     62    1300156     87    38887673
 38      26015     63    1505499     88    44108109
 39      31185     64    1741630     89    49995925
 40      37338     65    2012558     90    56634173
 41      44583     66    2323520     91    64112359
 42      53174     67    2679689     92    72533807
 43      63261     68    3087735     93    82010177
 44      75175     69    3554345     94    92669720
 45      89134     70    4087968     95   104651419
 46     105558     71    4697205     96   118114304
 47     124754     72    5392783     97   133230930
 48     147273     73    6185689     98   150198136
 49     173525     74    7089500     99   169229875
 50     204226     75    8118264    100   190569292
 51     239943     76    9289091


EXERCISES


1.84 Use a computer to calculate $d_{100}$.


1.85 Prove the formula $d_n = nd_{n-1} + (-1)^n$ for $n \ge 2$.


1.86 Find (with proof) a formula for ${n \brack n-1}$ for $n \ge 2$.


1.87 For $n \ge 2$, prove that among the permutations of an n-element set, there are as many with an even number of disjoint cycles as with an odd number of disjoint cycles. This explains why the alternate addition and subtraction of the entries of any row n, with $n \ge 2$, of the table of Stirling numbers of the first kind is equal to 0.


1.88 Show that the Bell numbers B(n) satisfy the recurrence formula
$$B(n+1) = \sum_{k=0}^{n}\binom{n}{k}B(k), \quad n \ge 0,$$
where B(0) = 1.

1.89 Prove that the expected number of parts in a random partition 
of {1,2,3,...,n} is (B(n + 1) — B(n))/B(n), where B(n) is the nth Bell 
number. 


1.90 Show that the recurrence relations for the Stirling numbers of 
the first and second kinds (allowing for negative values of the 
arguments) are equivalent. 


1.91 Show that $p(n, k) = \sum_{j=1}^{k} p(n-k, j)$.
1.92 Let $b_n$ be the number of order-preserving labelings of the complete binary tree with $2^n - 1$ nodes using the integers $\{1, 2, \ldots, 2^n - 1\}$. Show that $b_1 = 1$ and $b_n = \binom{2^n - 2}{2^{n-1} - 1}b_{n-1}^2$ for $n \ge 2$.


1.10 Counting and number theory 


In this section, we investigate divisibility properties of factorials, 
binomial coefficients, and Fibonacci numbers. 


EXAMPLE 1.35 


How many 0's occur at the right of 40!? 


Solution: The 0's at the right of 40! occur because of factors of 2 and 5 among the numbers 1, 2,...,40. Since there are more 2's than 5's, the number of 0's is determined by the exponent of 5 that divides 40!. This number is
$$\sum_{k=1}^{\infty}\left\lfloor\frac{40}{5^k}\right\rfloor = 8 + 1 = 9.$$
In the following discussion, let p be a prime.


De Polignac’s formula. The exponent to which p divides n! is given by
$$\sum_{k=1}^{\infty}\left\lfloor\frac{n}{p^k}\right\rfloor.$$

Let $d_b(n)$ be the sum of the “digits” in the base b representation of n. For instance, if the base 3 representation of n is 1020120, then $d_3(n) = 6$.


Theorem. The exponent of 2 that divides n! is $n - d_2(n)$.


Proof. Let the base 2 representation of n be
$$n = b_kb_{k-1}\cdots b_1b_0.$$
Then $n = \sum_{i=0}^{k} b_i2^i$ and the exponent of 2 that divides n! is
$$\begin{aligned}
\sum_{i=1}^{\infty}\left\lfloor\frac{n}{2^i}\right\rfloor &= (b_1 + 2b_2 + 2^2b_3 + \cdots + 2^{k-1}b_k) \\
&\quad + (b_2 + 2b_3 + \cdots + 2^{k-2}b_k) \\
&\quad + \cdots \\
&\quad + b_k \\
&= (2^0 - 1)b_0 + (2^1 - 1)b_1 + (2^2 - 1)b_2 + \cdots + (2^k - 1)b_k \\
&= n - d_2(n).
\end{aligned}$$


Here is the general version of the theorem.
Legendre’s formula (1808). The exponent of p that divides n! is
$$\frac{n - d_p(n)}{p - 1}.$$
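
The two formulas for the exponent of p in n! can be compared numerically. The sketch below is illustrative only; the function names are ours. It computes the exponent once as de Polignac's sum of floors and once from the base p digit sum, and checks the trailing-zero count of Example 1.35.

```python
def digit_sum(n, base):
    """Sum of the digits of n written in the given base."""
    total = 0
    while n:
        n, digit = divmod(n, base)
        total += digit
    return total


def exponent_de_polignac(n, p):
    """Exponent of the prime p in n!, as the sum of floor(n / p^k) over k >= 1."""
    exponent, power = 0, p
    while power <= n:
        exponent += n // power
        power *= p
    return exponent


def exponent_legendre(n, p):
    """Exponent of the prime p in n!, as (n - d_p(n)) / (p - 1)."""
    return (n - digit_sum(n, p)) // (p - 1)


print(exponent_de_polignac(40, 5), exponent_legendre(40, 5))
print(all(
    exponent_de_polignac(n, p) == exponent_legendre(n, p)
    for n in range(1, 500)
    for p in (2, 3, 5, 7, 11)
))
```

Legendre's form needs only the base p digits of n, which makes it convenient when n is given in positional notation.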
Next we will look at divisibility of binomial coefficients. 


Theorem. If $1 \le k \le p - 1$, then $\binom{p}{k}$ is divisible by p.


Proof. The numerator of p!/(k!(p — k)!) is a multiple of p and p does not 
divide the denominator. 


Kummer’s theorem (1852). The exponent to which p divides the binomial coefficient $\binom{n}{k}$ is equal to the number of carries when k and n - k are added in base p.


Proof. We will show the proof in the base 2 case. Let j = n - k. The exponent to which 2 divides $\binom{n}{k}$ is
$$n - d_2(n) - (j - d_2(j) + k - d_2(k)) = d_2(j) + d_2(k) - d_2(n).$$
Assume that the binary representation of n requires l binary digits. For $0 \le i \le l$, let $n_i$, $j_i$, and $k_i$ be the ith binary digit of the expansion of n, j, and k, respectively; let $c_i = 1$ if there is a carry in the ith place when j and k are added (in binary) and $c_i = 0$ if there is no carry. Also, define $c_{-1} = 0$. We see that $n_i = j_i + k_i + c_{i-1} - 2c_i$ for $0 \le i \le l$. Hence, since there is no carry out of the top place, the exponent to which 2 divides $\binom{n}{k}$ is
$$\sum_{i=0}^{l}(j_i + k_i - n_i) = \sum_{i=0}^{l}(2c_i - c_{i-1}) = \sum_{i=0}^{l} c_i.$$
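
Kummer's theorem can be tested for any prime, not only p = 2. The sketch below is an illustration with function names of our choosing: it counts the carries produced when j and k are added in base p and compares the count with the exact exponent of p in the binomial coefficient, computed from `math.comb`.

```python
from math import comb


def carries(j, k, p):
    """Number of carries when j and k are added in base p."""
    count, carry = 0, 0
    while j or k or carry:
        carry = 1 if (j % p) + (k % p) + carry >= p else 0
        count += carry
        j //= p
        k //= p
    return count


def valuation(m, p):
    """Exponent of the prime p in the positive integer m."""
    exponent = 0
    while m % p == 0:
        m //= p
        exponent += 1
    return exponent


print(all(
    valuation(comb(n, k), p) == carries(k, n - k, p)
    for p in (2, 3, 5, 7)
    for n in range(1, 80)
    for k in range(n + 1)
))
```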


Corollary. For $e \ge 1$ and $1 \le x < p^e$, we have
$$\binom{p^e}{x} \equiv 0 \pmod{p}.$$


The next theorem gives a practical method for calculating binomial 
coefficients modulo p. 


Lucas’ theorem (1878). Suppose that $0 \le a_i, b_i < p$ for $0 \le i \le k$. Then
$$\binom{a_0 + a_1p + a_2p^2 + \cdots + a_kp^k}{b_0 + b_1p + b_2p^2 + \cdots + b_kp^k} \equiv \binom{a_0}{b_0}\binom{a_1}{b_1}\binom{a_2}{b_2}\cdots\binom{a_k}{b_k} \pmod{p}.$$
Proof. The left side counts the ways of choosing $b_0 + b_1p + b_2p^2 + \cdots + b_kp^k$ balls from a set of $a_0 + a_1p + a_2p^2 + \cdots + a_kp^k$ balls. Suppose that the balls to be selected from are in boxes, with $a_0$ boxes containing a single ball each, $a_1$ boxes containing p balls each, $a_2$ boxes containing $p^2$ balls each,..., and $a_k$ boxes containing $p^k$ balls each. In selecting the balls from the boxes, any choice of only some balls from a box leads to a contribution of 0 (mod p), since $\binom{p^e}{x} \equiv 0 \pmod{p}$ for $1 \le x < p^e$.

Hence, the only selections that matter (modulo p) are those that take none or all the balls from a particular box. This means that we need to select $b_i$ boxes from a set of $a_i$ boxes from which to take all the balls, for $0 \le i \le k$. The number of ways to do this is
$$\binom{a_0}{b_0}\binom{a_1}{b_1}\binom{a_2}{b_2}\cdots\binom{a_k}{b_k}.$$
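
Lucas' theorem gives a digit-by-digit procedure for reducing a binomial coefficient modulo p. The following sketch is one way to write it; the function name `binomial_mod_p` is ours, and the cross-check against `math.comb` is only feasible for small arguments.

```python
from math import comb


def binomial_mod_p(n, k, p):
    """Binomial coefficient C(n, k) modulo the prime p, via Lucas' theorem."""
    result = 1
    while n or k:
        n, a = divmod(n, p)  # a is the current base-p digit of n
        k, b = divmod(k, p)  # b is the current base-p digit of k
        result = result * comb(a, b) % p  # comb(a, b) is 0 when b > a
    return result


print(binomial_mod_p(59, 12, 7))
print(all(
    binomial_mod_p(n, k, p) == comb(n, k) % p
    for p in (2, 3, 5, 7, 11)
    for n in range(100)
    for k in range(n + 1)
))
```

Because only the base p digits are used, the same function works for n and k far too large for the binomial coefficient itself to be written down.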


Say that the base b representation of m dominates the base b representation of n if the former is greater than or equal to the latter in each place.


Corollary. The binomial coefficient $\binom{n}{k}$ is divisible by p if and only if the base p representation of n does not dominate the base p representation of k.


EXAMPLE 1.36
Is $\binom{59}{12}$ divisible by 7?
Solution: We have $59 = 1 \cdot 7^2 + 1 \cdot 7 + 3$ and $12 = 1 \cdot 7 + 5$. Since the base 7 representation of 59 does not dominate the base 7 representation of 12 (in the units place, 3 is less than 5), we conclude that $\binom{59}{12}$ is divisible by 7.


Here is a charming result about numbers in a row of Pascal’s triangle. 


Erdős and Szekeres theorem (1978). In a row of Pascal’s triangle, any two numbers other than the 1’s have a common factor.


Proof. Suppose that the numbers are $\binom{n}{j}$ and $\binom{n}{k}$, with 0 < j < k < n. Then, by the subcommittee identity,
$$\binom{n}{k}\binom{k}{j} = \binom{n}{j}\binom{n-j}{k-j}.$$

Obviously, $\binom{n}{j}$ divides the right side of this relation, so it also divides the left side.

However, if $\binom{n}{j}$ and $\binom{n}{k}$ were coprime, then $\binom{n}{j}$ would divide $\binom{k}{j}$, but this is impossible since $\binom{n}{j} > \binom{k}{j}$.

It is not known whether there are infinitely many Fibonacci numbers 
that are primes. There are infinitely many composite Fibonacci numbers, 
since every third Fibonacci number is even. We will show that there 
exist relatively prime positive integers a and b such that the Fibonacci- 
like sequence defined by the Fibonacci recurrence relation and the initial 
values a and b contains no prime numbers. Our sequence $\{a_n\}$ is defined by $a_0 = a$, $a_1 = b$, $a_n = a_{n-1} + a_{n-2}$, $n \ge 2$.

It follows by mathematical induction that
$$a_n = aF_{n-1} + bF_n, \quad n \ge 1.$$

We define 17 quadruples of integers $(p_i, m_i, r_i, c_i)$, where $1 \le i \le 17$.

These quadruples satisfy the following properties:
(1) each $p_i$ is prime;
(2) $p_i \mid F_{m_i}$;
(3) the congruences $x \equiv r_i \pmod{m_i}$ cover all the integers, that is, given any integer n, one of the congruences is satisfied by n.

The purpose of the $c_i$ is to control the size of a and b.

We define
$$a \equiv c_iF_{m_i - r_i} \pmod{p_i}, \quad b \equiv c_iF_{m_i - r_i + 1} \pmod{p_i}, \quad \text{for all } i.$$
Such a and b exist by the Chinese remainder theorem.


Chinese Remainder Theorem. If $n_1, n_2, \ldots, n_k$ are pairwise relatively prime numbers, and $r_1, r_2, \ldots, r_k$ are any integers, then there exists an integer x satisfying the simultaneous congruences
$$x \equiv r_1 \pmod{n_1}$$
$$x \equiv r_2 \pmod{n_2}$$
$$\vdots$$
$$x \equiv r_k \pmod{n_k}.$$

Furthermore, x is unique modulo $n_1n_2 \cdots n_k$.
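
The theorem is constructive, and a solution can be built up one congruence at a time. The sketch below shows one standard construction (not necessarily the one the author has in mind): it keeps a solution x modulo the product of the moduli processed so far and adjusts it to satisfy the next congruence. The function name `crt` is ours; the three-argument `pow` with exponent -1 (a modular inverse) requires Python 3.8 or later, and all moduli are assumed to be greater than 1.

```python
from math import gcd


def crt(residues, moduli):
    """Solve x = r_i (mod n_i) for pairwise coprime moduli n_i > 1.

    Returns (x, N) where N is the product of the moduli and 0 <= x < N.
    """
    x, modulus = 0, 1
    for r, n in zip(residues, moduli):
        if gcd(modulus, n) != 1:
            raise ValueError("moduli must be pairwise relatively prime")
        # Choose t so that x + modulus * t = r (mod n).
        t = (r - x) * pow(modulus, -1, n) % n
        x += modulus * t
        modulus *= n
    return x % modulus, modulus


# A small example: x = 2 (mod 3), x = 3 (mod 5), x = 2 (mod 7).
print(crt([2, 3, 2], [3, 5, 7]))
```

Each step preserves the congruences already satisfied, because the adjustment added to x is a multiple of the product of the earlier moduli.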


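It follows that
$$\begin{aligned}
a_n &\equiv c_iF_{m_i - r_i}F_{n-1} + c_iF_{m_i - r_i + 1}F_n \pmod{p_i} \\
&\equiv c_i(F_{m_i - r_i}F_{n-1} + F_{m_i - r_i + 1}F_n) \pmod{p_i} \\
&\equiv c_iF_{m_i - r_i + n} \pmod{p_i}.
\end{aligned}$$

Since $F_m \mid F_{mn}$, we have $p_i \mid a_n$ whenever $n \equiv r_i \pmod{m_i}$.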


The following collection of 17 quadruples was found by Maxim Vsemirnov:
(3, 4, 3, 2)         (2, 3, 1, 1)         (5, 5, 4, 2)
(7, 8, 5, 3)         (17, 9, 2, 5)        (11, 10, 6, 6)
(47, 16, 9, 34)      (19, 18, 14, 14)     (61, 15, 12, 29)
(23, 24, 17, 6)      (107, 36, 8, 19)     (31, 30, 0, 21)
(1103, 48, 33, 9)    (181, 90, 80, 58)    (41, 20, 18, 11)
(541, 90, 62, 85)    (2521, 60, 48, 306)
Using the Chinese remainder theorem, we find
a = 106276436867, b = 35256392432.
These are composite numbers with no common factor.


EXERCISES 
1.93 How many 0's occur at the right of 1000!? 


1.94 Prove that (kn)! is divisible by $(n!)^k$ for all positive integers k and n.


1.95 Prove that $\binom{n}{k}$ is divisible by n if gcd(n, k) = 1.

1.96 Let k and m be integers such that $0 \le m \le 2^k - 1$. Prove that the binomial coefficient $\binom{2^k - 1}{m}$ is odd.

1.97 Suppose that n has k 1's when expressed in binary. Prove that the number of odd entries in the nth row of Pascal’s triangle is $2^k$.
1.98 Paul Erdős proved that there is only one binomial coefficient $\binom{n}{k}$ with $3 \le k \le n/2$ that is a power of an integer. Use a computer to find this binomial coefficient.

1.99 Use a computer to find all ordered pairs (n, e) with $2 \le e \le (n-1)/2 \le 50$ and $\sum_{k=0}^{e}\binom{n}{k}$ a power of 2. This calculation will be important in Chapters 5 and 6.

1.100 For what n is n! + 1 a perfect square? 


1.101 Prove that n! cannot be a perfect square greater than 1. 


1.102 Notice that 6! = 3!5!. Can you find other instances of integers 
a, b, and c, all greater than 1, such that a!b! = c!? Is there a pattern 
to these numbers? 

1.103 Prove the following result of Erdős and Szekeres (1978):


(CC) = 


where 0 <i <j <n/2. 


1.104 (Fermat’s little theorem) Prove that if p is a prime, then $a^p \equiv a \pmod{p}$. However, find a composite number n such that $2^n \equiv 2 \pmod{n}$.

1.105 Prove that $L_p \equiv 1 \pmod{p}$ if p is a prime. However, find a composite number n such that $L_n \equiv 1 \pmod{n}$.

1.106 (Perrin’s sequence) Define $\{a_n\}$ by $a_0 = 3$, $a_1 = 0$, $a_2 = 2$, and $a_n = a_{n-3} + a_{n-2}$, for $n \ge 3$. This sequence is called Perrin’s sequence. Prove that $p \mid a_p$, for p prime. However, find a composite number n such that $n \mid a_n$.


1.107 Prove that $F_m \mid F_n$ if and only if $m \mid n$.
1.108 Prove that $\gcd(F_m, F_n) = F_{\gcd(m, n)}$.


Notes 


Pascal’s triangle is named after the French mathematician Blaise 
Pascal (1623-1662), who introduced it in his work Traité du triangle
arithmétique (Treatise on the arithmetical triangle), in which he used
the triangle to solve problems in probability. However, the triangle 
was known much earlier in various places around the world. The 
Indian mathematician Bhattotpala (c. 1068) gave the first sixteen 
rows of the triangle. In China, the triangle is called “Yanghui’s 
triangle,” named after the mathematician Yang Hui (1238-1298). 

The inclusion—exclusion principle was first studied by D. A. da 
Silva in 1854. It was studied by James Sylvester (1814-1897) in 
1883 and is sometimes referred to as Sylvester’s cross-classification 
principle. 

Fibonacci (Leonardo of Pisa) introduced and discussed the 
Fibonacci numbers in his Liber Abaci (“Book of Calculations”) in 
1202. Abraham de Moivre gave the explicit formula for the 
Fibonacci numbers in 1730. 

Lucas numbers were first investigated by François Édouard
Anatole Lucas (1842-1891).


CHAPTER 2 


GENERATING FUNCTIONS 


Generating functions are algebraic objects that provide a powerful tool for 
analyzing recurrence relations. In this chapter, we will cover the basic theory of 
generating functions and examine many specific examples. 


2.1 Rational generating functions 


Given any sequence $a_0, a_1, a_2, \ldots$, the ordinary generating function is
(2.1) $$f(x) = \sum_{n=0}^{\infty} a_nx^n.$$
The generating function f(x) contains all of the information about the sequence $\{a_n\}$, and, being an algebraic entity, it is often easier to manipulate than the sequence itself. The term $a_n$ is recovered by finding the coefficient of $x^n$ in f(x).


EXAMPLE 2.1 Generating function for the 
Fibonacci sequence 
Find the ordinary generating function for the Fibonacci sequence $\{F_0, F_1, F_2, \ldots\}$.
Solution: Let $f(x) = \sum_{n=0}^{\infty} F_nx^n$. We obtain
$$\begin{aligned}
f(x) &= x + x^2 + 2x^3 + 3x^4 + 5x^5 + \cdots \\
xf(x) &= x^2 + x^3 + 2x^4 + 3x^5 + 5x^6 + \cdots \\
x^2f(x) &= x^3 + x^4 + 2x^5 + 3x^6 + 5x^7 + \cdots.
\end{aligned}$$

Through mass cancellation, the recurrence relation for the Fibonacci numbers yields
$$f(x) - xf(x) - x^2f(x) = x$$
and
(2.2) $$f(x) = \frac{x}{1 - x - x^2}.$$

The function f(x) contains complete information about the Fibonacci numbers and can be used to evaluate related infinite sums such as $\sum_{n=1}^{\infty} nF_n/3^n$. The computation of this sum is called for in the exercises.

For what values of x is the generating function valid?
Similarly, you can show that the generating function for the sequence of Lucas numbers $\{L_n\}$ is
$$\frac{2 - x}{1 - x - x^2}.$$
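
Both rational functions can be expanded back into power series by machine. The sketch below is an illustration, with a function name of our own: it divides a numerator polynomial by a denominator polynomial with constant term 1, producing the series coefficients one at a time.

```python
def series_coefficients(numerator, denominator, count):
    """First `count` power-series coefficients of numerator / denominator.

    Polynomials are coefficient lists in increasing degree, and
    denominator[0] must equal 1.
    """
    coeffs = []
    for n in range(count):
        value = numerator[n] if n < len(numerator) else 0
        # Move the known lower-order terms of the product to the other side.
        for i in range(1, min(n, len(denominator) - 1) + 1):
            value -= denominator[i] * coeffs[n - i]
        coeffs.append(value)
    return coeffs


# x / (1 - x - x^2) and (2 - x) / (1 - x - x^2)
print(series_coefficients([0, 1], [1, -1, -1], 10))
print(series_coefficients([2, -1], [1, -1, -1], 10))
```

The inner loop is just the linear recurrence read off from the denominator, which is the connection the rest of this section makes precise.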

Notice that the generating function for the Fibonacci sequence is a rational function. Recall that a sequence $\{a_n\}$ satisfies a linear recurrence relation with constant coefficients $c_1, \ldots, c_k$ if
$$a_n = \sum_{i=1}^{k} c_ia_{n-i}$$
for all $n \ge k$. A sequence satisfies a linear recurrence relation with constant coefficients if and only if it has a rational ordinary generating function.
Also, notice that the denominator of the generating function for the Fibonacci numbers, $1 - x - x^2$, takes its form from the Fibonacci recurrence, while the numerator comes from multiplying the generating function by the denominator and keeping only those terms of degree less than 2.


Linear recurrence relation theorem. Let $\{a_n\}$ be a sequence and $c_1, \ldots, c_k$ be arbitrary numbers. Then the following three assertions are equivalent.

(1) The sequence $\{a_n\}$ satisfies a linear recurrence relation with constant coefficients $c_1, \ldots, c_k$, i.e.,
$$a_n = \sum_{i=1}^{k} c_ia_{n-i}, \quad n \ge k.$$
(2) The sequence $\{a_n\}$ has a rational ordinary generating function of the form
$$\frac{g(x)}{1 - \sum_{i=1}^{k} c_ix^i},$$
where g is a polynomial of degree at most k - 1.
(3) If
$$1 - \sum_{i=1}^{k} c_ix^i = (1 - r_1x)(1 - r_2x)\cdots(1 - r_kx),$$
with the $r_i$ distinct, then the terms of $\{a_n\}$ are given by the formula
$$a_n = \alpha_1r_1^n + \cdots + \alpha_kr_k^n, \quad n \ge 0,$$
where $\alpha_1, \ldots, \alpha_k$ are constants.
More generally, if
$$1 - \sum_{i=1}^{k} c_ix^i = (1 - r_1x)^{m_1}(1 - r_2x)^{m_2}\cdots(1 - r_lx)^{m_l},$$
where the roots $r_1, \ldots, r_l$ occur with multiplicities $m_1, \ldots, m_l$, then
$$a_n = p_1(n)r_1^n + \cdots + p_l(n)r_l^n, \quad n \ge 0,$$
where the $p_i$ are polynomials with $\deg p_i < m_i$ for $1 \le i \le l$.


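Proof. (1) ⇒ (2) Assume that $\{a_n\}$ satisfies a recurrence relation of the type specified in (1). Let f(x) be the ordinary generating function for $\{a_n\}$. Then
$$\begin{aligned}
f(x) &= \sum_{n=0}^{\infty} a_nx^n \\
&= \sum_{n=0}^{k-1} a_nx^n + \sum_{n=k}^{\infty} a_nx^n \\
&= \sum_{n=0}^{k-1} a_nx^n + \sum_{n=k}^{\infty}\sum_{i=1}^{k} c_ia_{n-i}x^n \\
&= \sum_{n=0}^{k-1} a_nx^n + \sum_{i=1}^{k} c_i\sum_{n=k}^{\infty} a_{n-i}x^n \\
&= \sum_{n=0}^{k-1} a_nx^n + \sum_{i=1}^{k} c_ix^i\sum_{n=k-i}^{\infty} a_nx^n \\
&= \sum_{n=0}^{k-1} a_nx^n + \sum_{i=1}^{k} c_ix^i\left(f(x) - \sum_{n=0}^{k-i-1} a_nx^n\right).
\end{aligned}$$
Hence
$$f(x) = \frac{g(x)}{1 - \sum_{i=1}^{k} c_ix^i},$$
where $\deg g \le k - 1$.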
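(2) ⇒ (3) First consider the case where
$$1 - \sum_{i=1}^{k} c_ix^i = (1 - r_1x)\cdots(1 - r_kx)$$
and the $r_i$ are distinct. Expanding by partial fractions, we obtain
$$\begin{aligned}
f(x) &= \frac{\alpha_1}{1 - r_1x} + \cdots + \frac{\alpha_k}{1 - r_kx} \\
&= \alpha_1\sum_{n=0}^{\infty} r_1^nx^n + \cdots + \alpha_k\sum_{n=0}^{\infty} r_k^nx^n \\
&= \sum_{n=0}^{\infty}(\alpha_1r_1^n + \cdots + \alpha_kr_k^n)x^n,
\end{aligned}$$
where $\alpha_1, \ldots, \alpha_k$ are constants. Since this is just another formula for the ordinary generating function for $\{a_n\}$, we have $a_n = \alpha_1r_1^n + \cdots + \alpha_kr_k^n$, $n \ge 0$.

More generally, assume that $1 - \sum_{i=1}^{k} c_ix^i$ has repeated roots. Suppose that r is a root with multiplicity m. Then, in the partial fraction decomposition of
$$\frac{g(x)}{1 - \sum_{i=1}^{k} c_ix^i},$$
we have terms
$$\frac{\beta_1}{1 - rx}, \quad \frac{\beta_2}{(1 - rx)^2}, \quad \ldots, \quad \frac{\beta_m}{(1 - rx)^m},$$
where $\beta_1, \beta_2, \ldots, \beta_m$ are constants. By the formula
$$\frac{1}{(1 - x)^d} = \sum_{k=0}^{\infty}\binom{d + k - 1}{k}x^k,$$
the contribution to $x^n$ in the power series for these fractions is $p(n)r^nx^n$, where p is a polynomial of degree less than m. The rest of the proof that (2) ⇒ (3) follows as in the special case.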


Each step in the proof is reversible. 
The factorization of $1 - \sum_{i=1}^{k} c_ix^i$ called for in the proof (and in practice) can be accomplished using the change of variables y = 1/x. Then
$$1 - \sum_{i=1}^{k} c_ix^i = 1 - \sum_{i=1}^{k} c_iy^{-i} = y^{-k}\left(y^k - \sum_{i=1}^{k} c_iy^{k-i}\right).$$
The problem is reduced to factoring the polynomial
$$y^k - \sum_{i=1}^{k} c_iy^{k-i}.$$
This polynomial is the characteristic polynomial of the recurrence relation.


EXAMPLE 2.2 


Find the generating function for the sequence defined by the recurrence relation $a_n = 6a_{n-1} - 9a_{n-2}$ for $n \ge 2$, and $a_0 = 1$, $a_1 = 1$. (This comes from Example 1.31.) Use the generating function to find a direct formula for $a_n$.
Solution: The form of the recurrence relation tells us that the denominator of the generating function is $1 - 6x + 9x^2$. To get the numerator, we calculate
$$(1 - 6x + 9x^2)(a_0 + a_1x) = (1 - 6x + 9x^2)(1 + x) = 1 - 5x + \cdots.$$
The only terms of degree less than 2 are 1 - 5x, so the numerator is 1 - 5x. Hence, the generating function is
$$\frac{1 - 5x}{1 - 6x + 9x^2}.$$
To find a direct formula for $a_n$, we write the generating function as
$$(1 - 5x)(1 - 3x)^{-2}.$$
Thus, we have a binomial series with a negative exponent. Its expansion is
$$(1 - 5x)\sum_{k=0}^{\infty}(-1)^k3^k\binom{-2}{k}x^k.$$
Therefore
$$\begin{aligned}
a_n &= (-1)^n3^n\binom{-2}{n} - 5(-1)^{n-1}3^{n-1}\binom{-2}{n-1} \\
&= 3^n\binom{n+1}{n} - 5 \cdot 3^{n-1}\binom{n}{n-1} \\
&= 3^n(n + 1) - 5n3^{n-1} \\
&= 3^n - 2n3^{n-1}, \quad n \ge 0.
\end{aligned}$$
This is the same solution we saw before.
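
A quick numerical comparison confirms the closed form. In the sketch below (the function names are ours), the formula $3^n - 2n3^{n-1}$ is rewritten as $3^n(3 - 2n)/3$ so that the computation stays in integers even for n = 0.

```python
def by_recurrence(count):
    """a_0 = a_1 = 1 and a_n = 6a_{n-1} - 9a_{n-2}."""
    a = [1, 1]
    while len(a) < count:
        a.append(6 * a[-1] - 9 * a[-2])
    return a[:count]


def by_formula(n):
    """a_n = 3^n - 2n * 3^(n-1), written as 3^n (3 - 2n) / 3 (an exact division)."""
    return (3 ** n * (3 - 2 * n)) // 3


print(by_recurrence(8))
print([by_formula(n) for n in range(8)])
```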


EXAMPLE 2.3 Change for a dollar 


How many ways can you make change for $1.00 using units of 0.01, 0.05, 

0.10, 0.25, 0.50, and 1.00? Here are some examples: 

5 + 10 + 10 + 25 + 50

1 + 1 + 1 + 1 + 1 + 5 + 10 + 10 + 10 + 10 + 25 + 25

25 + 25 + 25 + 25.
Solution: For $n \ge 0$, let $a_n$ be the number of ways to make change for an amount n. We set $a_0 = 1$. The generating function for $\{a_n\}$ is
$$f(x) = 1 + 1x + 1x^2 + 1x^3 + 1x^4 + 2x^5 + 2x^6 + 2x^7 + 2x^8 + 2x^9 + 4x^{10} + \cdots.$$

This generating function has the rational form
(2.3) $$f(x) = \frac{1}{(1 - x)(1 - x^5)(1 - x^{10})(1 - x^{25})(1 - x^{50})(1 - x^{100})}.$$

Using a computer algebra system, we find that the coefficient of $x^{100}$ of the generating function is 293, i.e., there are 293 ways to make change for a dollar.
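
A computer algebra system is not strictly necessary for this coefficient. The sketch below is a plain-Python alternative, not the method used in the text: multiplying a power series by $1/(1 - x^c)$ is the same as taking running sums with step c, so the coefficients can be accumulated one denomination at a time. The function name `change_counts` is ours, and amounts are measured in cents.

```python
def change_counts(coins, target):
    """Coefficients a_0, ..., a_target of the product of 1 / (1 - x^c) over the coins."""
    ways = [1] + [0] * target  # the series 1, i.e. a_0 = 1
    for coin in coins:
        # Multiply the current series by 1 / (1 - x^coin).
        for amount in range(coin, target + 1):
            ways[amount] += ways[amount - coin]
    return ways


ways = change_counts([1, 5, 10, 25, 50, 100], 100)
print(ways[:11])
print(ways[100])
```

The first eleven coefficients can be compared with the series written out at the start of the solution.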
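The factors in the denominator of the generating function give rise to geometric series. For example, the second factor gives
$$\frac{1}{1 - x^5} = 1 + x^5 + x^{2 \cdot 5} + x^{3 \cdot 5} + x^{4 \cdot 5} + x^{5 \cdot 5} + x^{6 \cdot 5} + \cdots.$$

Each term in the product corresponds to a way to make change for a dollar. For example, the term corresponding to the sum 5 + 10 + 10 + 25 + 25 + 25 is shown in boldface:
$$\begin{aligned}
&(\mathbf{1} + x + x^2 + x^3 + x^4 + x^5 + x^6 + \cdots) \\
&(1 + \mathbf{x^5} + x^{2 \cdot 5} + x^{3 \cdot 5} + x^{4 \cdot 5} + x^{5 \cdot 5} + x^{6 \cdot 5} + \cdots) \\
&(1 + x^{10} + \mathbf{x^{2 \cdot 10}} + x^{3 \cdot 10} + x^{4 \cdot 10} + x^{5 \cdot 10} + x^{6 \cdot 10} + \cdots) \\
&(1 + x^{25} + x^{2 \cdot 25} + \mathbf{x^{3 \cdot 25}} + x^{4 \cdot 25} + x^{5 \cdot 25} + x^{6 \cdot 25} + \cdots) \\
&(\mathbf{1} + x^{50} + x^{2 \cdot 50} + x^{3 \cdot 50} + x^{4 \cdot 50} + x^{5 \cdot 50} + x^{6 \cdot 50} + \cdots) \\
&(\mathbf{1} + x^{100} + x^{2 \cdot 100} + x^{3 \cdot 100} + x^{4 \cdot 100} + x^{5 \cdot 100} + x^{6 \cdot 100} + \cdots)
\end{aligned}$$
Since the denominator of the generating function is a polynomial of degree 191, the sequence $\{a_n\}$ satisfies a linear recurrence relation of order 191. The explicit formula for $a_n$ involves a sum of exponential functions which are powers of 100th roots of unity.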


EXAMPLE 2.4 Alcuin’s sequence 


How many incongruent triangles have integer side lengths and perimeter $10^{100}$?


Solution: Let t(n) be the number of triangles with integer side lengths and 
perimeter n. 

We set t(0) = 0. It is easy to find the values in the following table:

n      0  1  2  3  4  5  6  7  8
t(n)   0  0  0  1  0  1  1  2  1

The sequence {t(n)} is known as Alcuin’s sequence, named after Alcuin of York (735-804), who wrote a problem solving book containing some allocation problems equivalent to finding integer triangles.

The generating function for {t(n)} is rational:

(2.4) $$\sum_{n=0}^{\infty} t(n)x^n = \frac{x^3}{(1 - x^2)(1 - x^3)(1 - x^4)}.$$

We prove this by showing that we can obtain any integer triangle from (1,1,1) by adding nonnegative integer multiples of (0,1,1), (1,1,1), and (1,1,2). These triples satisfy the weak triangle inequality, which is sufficient since we start with a genuine triangle. The unique solution to the equation
$$(a, b, c) = (1, 1, 1) + \alpha(0, 1, 1) + \beta(1, 1, 1) + \gamma(1, 1, 2),$$
in nonnegative integers α, β, and γ is α = b - a, β = a + b - c - 1, γ = c - b. Since 2α + 3β + 4γ = a + b + c - 3 = n - 3, we see that t(n) is equal to the number of ways of writing n - 3 as a sum of 2's, 3's, and 4's (order of terms is unimportant). This is equivalent to the number of ways to make change for n - 3 using any combination of 2's, 3's, and 4's.

Since the denominator of the (rational) generating function is
$$(1 - x^2)(1 - x^3)(1 - x^4) = 1 - x^2 - x^3 - x^4 + x^5 + x^6 + x^7 - x^9,$$
the sequence {t(n)} satisfies the order nine linear recurrence relation
$$t(n) = t(n-2) + t(n-3) + t(n-4) - t(n-5) - t(n-6) - t(n-7) + t(n-9), \quad n \ge 9.$$
The initial values, for $0 \le n \le 8$, are given in our table. The recurrence relation allows us to compute moderately large values of t(n) easily by computer.

The generating function has the partial fraction decomposition
$$-\frac{1}{24(x-1)^3} + \frac{13}{288(x-1)} - \frac{1}{16(x+1)^2} - \frac{1}{32(x+1)} - \frac{x+1}{8(x^2+1)} + \frac{x+2}{9(x^2+x+1)}.$$

From this expansion we will find a simple formula for t(n).

By the binomial series theorem, the first four terms can be written as
$$\frac{1}{24}\sum_{n=0}^{\infty}\binom{-3}{n}(-1)^nx^n - \frac{13}{288}\sum_{n=0}^{\infty} x^n - \frac{1}{16}\sum_{n=0}^{\infty}\binom{-2}{n}x^n - \frac{1}{32}\sum_{n=0}^{\infty}(-1)^nx^n.$$

From the identity $\binom{-n}{k} = (-1)^k\binom{n+k-1}{k}$, the coefficient of $x^n$ is
$$\frac{1}{24}\binom{n+2}{n} - \frac{13}{288} - \frac{1}{16}(-1)^n\binom{n+1}{n} - \frac{1}{32}(-1)^n = \frac{6n^2 + 18n - 1 - 18n(-1)^n - 27(-1)^n}{288}.$$
The final two terms yield coefficients of $x^n$ that follow a pattern modulo 12, i.e., c/72 where c is given in the table:

n mod 12    0    1    2    3    4    5    6    7    8    9   10   11
c           7  -17    1   25  -17  -17   25    1  -17    7    1    1

Thus, we obtain the formula $t(n) = n^2/48 + (c - 7)/72$ for n even, and $t(n) = (n + 3)^2/48 + (c - 7)/72$ for n odd. This can be represented as
(2.5) $$t(n) = \begin{cases} \{n^2/48\} & \text{for } n \text{ even} \\ \{(n+3)^2/48\} & \text{for } n \text{ odd}, \end{cases}$$
where {x} is the nearest integer to x. For example, to find the number of integer triangles with perimeter $10^{100}$, we compute $\{10^{200}/48\}$ by long division, and obtain the answer 2083...3, where there are one hundred ninety-six 3's.
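
Formula (2.5) can be evaluated exactly with integer arithmetic, which also makes the final claim easy to confirm. The sketch below is an illustration with function names of our own; it computes the nearest integer to m/48 as the floor of (m + 24)/48 (a tie would require a perfect square congruent to 24 modulo 48, which does not exist) and compares the formula with a direct count of triangles for small perimeters.

```python
def alcuin_formula(n):
    """t(n) from the nearest-integer formula, using exact integer arithmetic."""
    m = n * n if n % 2 == 0 else (n + 3) ** 2
    return (m + 24) // 48  # nearest integer to m / 48


def alcuin_brute_force(n):
    """Count integer triples a <= b <= c with a + b + c = n and a + b > c."""
    count = 0
    for a in range(1, n // 3 + 1):
        for b in range(a, (n - a) // 2 + 1):
            c = n - a - b
            if a + b > c:
                count += 1
    return count


print([alcuin_formula(n) for n in range(13)])
print(all(alcuin_formula(n) == alcuin_brute_force(n) for n in range(120)))

answer = str(alcuin_formula(10 ** 100))
print(len(answer), answer[:6], answer.count("3"))
```

Python integers have arbitrary precision, so the value for perimeter $10^{100}$ is computed exactly rather than by long division.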


EXAMPLE 2.5 


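We can prove Lucas’ theorem (see Section 1.10) using generating functions. Let’s examine the case k = 2.
From the binomial theorem modulo p, we have
$$\begin{aligned}
(1 + x)^{a_0 + a_1p + a_2p^2} &= (1 + x)^{a_0}(1 + x)^{a_1p}(1 + x)^{a_2p^2} \\
&\equiv (1 + x)^{a_0}(1 + x^p)^{a_1}(1 + x^{p^2})^{a_2} \pmod{p} \\
&= \left(1 + \binom{a_0}{1}x + \binom{a_0}{2}x^2 + \cdots\right)\left(1 + \binom{a_1}{1}x^p + \binom{a_1}{2}x^{2p} + \cdots\right)\left(1 + \binom{a_2}{1}x^{p^2} + \binom{a_2}{2}x^{2p^2} + \cdots\right) \\
&= \sum_{b_0, b_1, b_2}\binom{a_0}{b_0}\binom{a_1}{b_1}\binom{a_2}{b_2}x^{b_0 + b_1p + b_2p^2}.
\end{aligned}$$

From the uniqueness of base p expansion, we conclude that
$$\binom{a_0}{b_0}\binom{a_1}{b_1}\binom{a_2}{b_2} \equiv \binom{a_0 + a_1p + a_2p^2}{b_0 + b_1p + b_2p^2} \pmod{p}.$$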
EXERCISES 


2.1 Evaluate the infinite series $\sum_{n=1}^{\infty} nF_n/3^n$.
2.2 Find a rational generating function for the sequence $\{a_n\}$ given by the recurrence formula

$$a_0 = 0, \quad a_1 = 1, \quad a_n = 5a_{n-1} - 6a_{n-2}, \quad n \ge 2.$$


2.3 Evaluate the infinite series 
oo 


n On 
DEI 


n=0 


where $\{a_n\}$ is the sequence of the previous exercise.
2.4 Find a rational generating function for the sequence of perfect squares, $\{n^2\}$, for $n \ge 0$.
2.5 Find a rational generating function for the sequence $\{a_n\}$ defined by $a_0 = 1$, $a_1 = 3$, $a_n = a_{n-1} + a_{n-2} + 2^{n-2}$ for $n \ge 2$. See Exercise 1.75.


2.6 (a) Suppose that f(x) is the generating function for a sequence $\{a_n\}$. Show that
$$a_n = \frac{f^{(n)}(0)}{n!}.$$

(b) Suppose that f(x, y) is the generating function (in two variables) for a sequence $\{a_{m,n}\}$. Show that
$$a_{m,n} = \frac{\partial_x^m\partial_y^n f(0, 0)}{m!\,n!}.$$


2.7 Use a computer and an appropriate generating function to determine 
the number of ways of making change for $1 using an even number of 
coins. 
2.8 Use a generating function to determine the number of solutions in 
nonnegative integers to the equation 

a+ 2b+4c = 10" 
This problem duplicates Exercise 1.53. 
2.9 Determine the number of solutions in nonnegative integers to the 
equation 

a+ 2b+ 3c = 10°°. 


2.10 Determine the number of solutions in nonnegative integers to the 
equation 

a+b+4c= 10". 
2.11 (a) Show that the generating function (in two variables) for binomial coefficients is
$$\frac{1}{1 - x - y}.$$
(b) Show that the generating function (in three variables) for multinomial coefficients of the form $\binom{n}{k_1, k_2, k_3}$ is
$$\frac{1}{1 - x - y - z}.$$


2.12 Find a recurrence formula for the coefficient of x” in the series 


expansion of (1 + 3x + x2 b 


2.13 Find a recurrence formula for the coefficient of x” in the series 
expansion of (1 + x + 2x2) 1, 

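2.14 (Series multisection) Let $f(x) = \sum_{n=0}^{\infty} a_n x^n$. Show that, for any
integers p and q with $0 \le p < q$, we have

\[ \sum_{k=0}^{\infty} a_{p+kq}\,x^{p+kq} = \frac{1}{q}\sum_{j=0}^{q-1} \omega^{-jp} f(\omega^j x), \]

where $\omega$ is a primitive qth root of unity.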

2.15 Prove that, for each positive integer k, there exists a monic
polynomial p(n) of degree k + 1 with integer coefficients such that 
™ 


a i* (‘) = 2"—*p(n). 


2.16 Prove that Alcuin’s sequence is a zigzag sequence (its values
alternately rise and fall) for $n \ge 6$.
2.17 Prove that Alcuin’s sequence $\{t(n)\}$ satisfies the recurrence
relation

\[ t(n) = 3t(n-12) - 3t(n-24) + t(n-36), \quad n \ge 36. \]
2.18 Let $t_n$ be the number of triples (a, b, c), where a, b, c are
nonnegative integers satisfying $a \le b \le c$, $a + b \ge c$, and $a + b + c = n$.
Find the generating function for $\{t_n\}$ and use the generating function
to find $t_n$ for $0 \le n \le 6$.


2.19 Given any integer m > 1, prove that Alcuin’s sequence {t(n)} is 


periodic modulo m with period 12m. 


2.2 Special generating functions 
In this section we investigate certain interesting sequences and their 
generating functions. 

Given any sequence $a_0, a_1, a_2, \ldots$, we define the exponential
generating function

\[ f(x) = \sum_{n=0}^{\infty} a_n \frac{x^n}{n!}. \tag{2.6} \]

Let $d(x)$ be the exponential generating function for the sequence
$\{d_n\}$, where $d_n$ is the number of derangements of n elements:

\[ d(x) = \sum_{n=0}^{\infty} d_n \frac{x^n}{n!}. \]


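We set $d_0 = 1$. From the recurrence relation (1.31), it follows that

\[ (1 - x)\,d'(x) = x\,d(x), \]

for

\[
\begin{aligned}
(1 - x)\,d'(x) &= \sum_{n=1}^{\infty} d_n \frac{x^{n-1}}{(n-1)!} - \sum_{n=1}^{\infty} d_n \frac{x^{n}}{(n-1)!} \\
&= \sum_{n=0}^{\infty} (d_{n+1} - n d_n) \frac{x^n}{n!} \\
&= \sum_{n=1}^{\infty} n\,d_{n-1} \frac{x^n}{n!} \\
&= x \sum_{n=1}^{\infty} d_{n-1} \frac{x^{n-1}}{(n-1)!} \\
&= x\,d(x).
\end{aligned}
\]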


Separating variables, we obtain

\[ \int \frac{d'(x)}{d(x)}\,dx = \int \frac{x}{1-x}\,dx, \]

and hence

\[ d(x) = C\,\frac{e^{-x}}{1-x}. \]

The value $d_0 = 1$ implies that C = 1. Therefore

\[ d(x) = \frac{e^{-x}}{1-x}. \]
As we mentioned earlier, the coefficients of the generating function
give us a formula for the general term of the underlying sequence. In
this case, we obtain the explicit formula

\[ d_n = n! \sum_{i=0}^{n} \frac{(-1)^i}{i!}. \tag{2.8} \]
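The explicit formula (2.8) can be compared with values produced by a recurrence. The sketch below assumes the standard derangement recurrence d_n = (n - 1)(d_{n-1} + d_{n-2}), which may or may not be the exact form of (1.31); the function names are illustrative only.

```python
from fractions import Fraction
from math import factorial

def derangements_recurrence(n_max):
    """d_0..d_{n_max} from the assumed recurrence d_n = (n - 1)(d_{n-1} + d_{n-2})."""
    d = [1, 0]
    for n in range(2, n_max + 1):
        d.append((n - 1) * (d[n - 1] + d[n - 2]))
    return d[: n_max + 1]

def derangements_formula(n):
    """d_n = n! * sum_{i=0}^{n} (-1)^i / i!, evaluated with exact fractions."""
    total = sum(Fraction((-1) ** i, factorial(i)) for i in range(n + 1))
    return int(factorial(n) * total)
```

Exact fractions are used so that the alternating sum is not affected by floating-point rounding.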

We now consider a famous sequence of numbers called Catalan 
numbers, named after the mathematician Eugéne Charles Catalan 
(1814-1894). 

The Catalan number $C_n$ is the number of sequences of n A’s and n
B’s which have the property that, reading the sequence from left to
right, at each symbol the number of A’s seen thus far is always greater
than or equal to the number of B’s seen thus far.

The five valid sequences for n = 3 are

AAABBB

AABABB

AABBAB

ABAABB

ABABAB.

Hence $C_3 = 5$.

With a little work, we find the following values of $C_n$:

n      1   2   3   4

C_n    1   2   5   14

We set $C_0 = 1$.

Let $f(x) = \sum_{n=0}^{\infty} C_n x^n$ be the ordinary generating function for the
Catalan numbers. The strategy is to find an equation satisfied by f(x),
solve the equation, and read off the coefficients. The Catalan numbers
satisfy the recurrence relation

\[ C_n = \sum_{k=0}^{n-1} C_k C_{n-1-k}, \quad n \ge 1. \tag{2.9} \]

(Every valid sequence can be written uniquely in the form $A s_1 B s_2$,
where $s_1$ is a valid sequence with k A’s and B’s, and $s_2$ is a valid
sequence with n - 1 - k A’s and B’s.) This recurrence relation implies

\[ f(x) = 1 + x f(x)^2. \]

From the quadratic formula and the fact that f(0) = 1, it follows that

\[ f(x) = \frac{1 - \sqrt{1 - 4x}}{2x}. \tag{2.10} \]

From the binomial series for $\sqrt{1 - 4x}$, we obtain

\[ C_n = \frac{1}{n+1}\binom{2n}{n}. \tag{2.11} \]
It is easy to show from (2.11) that

\[ C_n = \frac{2(2n-1)}{n+1}\,C_{n-1}, \quad n \ge 1. \tag{2.12} \]

The Catalan numbers occur in many settings, such as the following:

• $C_n$ is the number of lattice paths in the first quadrant of the plane
which start at (0, 0), end at (2n, 0), and proceed at each step by
$(\Delta x, \Delta y) = (+1, +1)$ or $(+1, -1)$.

• $C_n$ is the number of binary search trees on n vertices.

• $C_{n-1}$ is the number of ways to parenthesize a product of n terms.

• $C_{n-2}$ is the number of triangulations of a convex n-gon with n - 3
nonintersecting diagonals.
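The recurrence (2.9), the closed form (2.11), and the defining count of A/B sequences can be checked against one another for small n. This is an illustrative sketch with hypothetical function names; the brute-force count enumerates all 2^(2n) strings, so it is only practical for small n.

```python
from itertools import product
from math import comb

def catalan_recurrence(n_max):
    """C_0..C_{n_max} from C_n = sum_{k=0}^{n-1} C_k C_{n-1-k}."""
    c = [1]
    for n in range(1, n_max + 1):
        c.append(sum(c[k] * c[n - 1 - k] for k in range(n)))
    return c

def catalan_closed(n):
    """C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

def count_valid_sequences(n):
    """Count strings of n A's and n B's in which every prefix has #A >= #B."""
    count = 0
    for word in product("AB", repeat=2 * n):
        balance = 0
        for ch in word:
            balance += 1 if ch == "A" else -1
            if balance < 0:      # more B's than A's so far: invalid
                break
        else:
            count += balance == 0  # valid only if the A's and B's balance out
    return count
```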

Next, we determine and work with generating functions for the 

Stirling numbers of the first and second kinds and for the Bell 
numbers. 


We observe a pattern in the generating function for the Stirling
numbers of the first kind. For instance,

\[ \sum_{k=1}^{3} {3 \brack k} x^k = 2x + 3x^2 + x^3 = x(x+1)(x+2). \]

Define the falling factorial function $x_{(n)}$ as

\[ x_{(n)} = x(x-1)(x-2)\cdots(x-n+1) \tag{2.13} \]

and the rising factorial function $x^{(n)}$ as

\[ x^{(n)} = x(x+1)(x+2)\cdots(x+n-1). \tag{2.14} \]

We set $x_{(0)} = x^{(0)} = 1$.

The polynomials $x_{(n)}$ and $x^{(n)}$ are related by

\[
\begin{aligned}
(-x)_{(n)} &= -x(-x-1)\cdots(-x-n+1) \\
&= (-1)^n x(x+1)\cdots(x+n-1) \\
&= (-1)^n x^{(n)}.
\end{aligned}
\tag{2.15}
\]


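The following theorem states that $x^{(n)}$ is the ordinary generating
function for the Stirling numbers of the first kind.


Generating function for the Stirling numbers of the first kind.

\[ x^{(n)} = \sum_{k=1}^{n} {n \brack k} x^k. \tag{2.16} \]


Proof. The proof is by induction on n. The case n = 1 is trivial:
$x^{(1)} = x = {1 \brack 1} x^1$.
Assuming the result for n, it follows that

\[
\begin{aligned}
x^{(n+1)} &= x^{(n)}(x + n) \\
&= \sum_{k=1}^{n} {n \brack k} x^k (x + n) \\
&= \sum_{k=1}^{n+1} \left( {n \brack k-1} + n {n \brack k} \right) x^k \\
&= \sum_{k=1}^{n+1} {n+1 \brack k} x^k,
\end{aligned}
\]

which is the correct formula for the n + 1 case.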


EXAMPLE 2.6 Expected number of cycles in a 
permutation 


What is the expected number of cycles in a randomly chosen 
permutation of {1,2,3,...,n}? 


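Solution: Differentiating the generating function for ${n \brack k}$ with respect
to x, we obtain

\[ D_x\,x^{(n)} = D_x \sum_{k=1}^{n} {n \brack k} x^k, \]

\[ D_x\,[x(x+1)\cdots(x+n-1)] = \sum_{k=1}^{n} k {n \brack k} x^{k-1}. \]

Evaluating the second identity at x = 1, we obtain

\[ n!\left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}\right) = \sum_{k=1}^{n} k {n \brack k}. \]

Dividing by n!, we find that the expected number of cycles is

\[ 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}, \]

an expression asymptotic to ln n. For instance, in a permutation on
1000 elements, we expect to find about seven cycles.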
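The conclusion of Example 2.6 can be confirmed by exhaustive enumeration for small n. The sketch below uses hypothetical helper names; it averages the cycle count over all n! permutations and compares the result with the harmonic sum.

```python
from fractions import Fraction
from itertools import permutations

def cycle_count(perm):
    """Number of cycles of a permutation given as a tuple mapping i -> perm[i]."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:   # walk around the cycle containing start
                seen.add(j)
                j = perm[j]
    return cycles

def expected_cycles_brute(n):
    """Exact average number of cycles over all permutations of n elements."""
    perms = list(permutations(range(n)))
    return Fraction(sum(cycle_count(p) for p in perms), len(perms))

def harmonic(n):
    """1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))
```

For n = 1000 the harmonic sum lies between 7 and 8, in line with the estimate of about seven cycles.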
The polynomial $x_{(n)}$ is the generating function for the signed Stirling
numbers of the first kind, $(-1)^{n+k} {n \brack k}$.


Generating function for the signed Stirling numbers of the first kind.

\[ x_{(n)} = \sum_{k=1}^{n} (-1)^{n+k} {n \brack k} x^k. \]


Proof. We have

\[
\begin{aligned}
x_{(n)} &= (-1)^n (-x)^{(n)} \\
&= (-1)^n \sum_{k=1}^{n} {n \brack k} (-x)^k \\
&= \sum_{k=1}^{n} (-1)^{n+k} {n \brack k} x^k.
\end{aligned}
\tag{2.17}
\]

To test this generating function, we compute $x_{(3)} = x(x-1)(x-2) =
x^3 - 3x^2 + 2x$ and note that the coefficients 1, -3, 2 are the signed
Stirling numbers of the first kind with n = 3.


Generating function for the Stirling numbers of the second kind.

\[ x^n = \sum_{k=1}^{n} {n \brace k} x_{(k)}. \]

The vector space of polynomials with real coefficients has as bases
the two sets $B_1 = \{x^n : n \ge 0\}$ and $B_2 = \{x_{(n)} : n \ge 0\}$. Recall that we
have set ${0 \brace 0} = {0 \brack 0} = 1$ and ${n \brace 0} = {n \brack 0} = 0$ for $n \ge 1$. Then
$S_1 = \left[(-1)^{n+k} {n \brack k}\right]$ is the change of basis matrix from $B_2$ to $B_1$, while
$S_2 = \left[{n \brace k}\right]$ is the change of basis matrix from $B_1$ to $B_2$. Therefore, the
two matrices $S_1$ and $S_2$ are inverses, so that $S_1S_2 = S_2S_1 = I$, where I
is the infinite-dimensional identity matrix. In summation form, this
assertion is written as

\[ \sum_{k=1}^{n} (-1)^{n+k} {n \brack k} {k \brace j} = \sum_{k=1}^{n} {n \brace k} (-1)^{k+j} {k \brack j} = \delta(n, j). \tag{2.19} \]


Recall that $\delta(n, j) = 1$ if n = j and 0 if $n \ne j$. This identity leads (as per
Exercise 1.59) to the following wonderful inversion formula.


Inversion formula for summations with Stirling numbers as weights. For any two
real-valued functions f and g, we have

\[ g(n) = \sum_{k=1}^{n} {n \brace k} f(k) \]

if and only if

\[ f(n) = \sum_{k=1}^{n} (-1)^{n+k} {n \brack k} g(k). \]


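Proof. Assume that $g(n) = \sum_{k=1}^{n} {n \brace k} f(k)$. Then

\[
\begin{aligned}
\sum_{k=1}^{n} (-1)^{n+k} {n \brack k} g(k) &= \sum_{k=1}^{n} (-1)^{n+k} {n \brack k} \sum_{j=1}^{k} {k \brace j} f(j) \\
&= \sum_{j=1}^{n} f(j) \sum_{k=1}^{n} (-1)^{n+k} {n \brack k} {k \brace j} \\
&= \sum_{j=1}^{n} f(j)\,\delta(n, j) \\
&= f(n).
\end{aligned}
\]

The reverse implication is proved similarly.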
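The orthogonality of the two Stirling matrices and the inversion formula can be tested numerically. The sketch below is an illustration only: it assumes the usual recurrences for the two kinds of Stirling numbers, and the names `stirling1`, `stirling2`, `forward`, and `inverse` are hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling1(n, k):
    """Unsigned Stirling number of the first kind (assumed standard recurrence)."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return stirling1(n - 1, k - 1) + (n - 1) * stirling1(n - 1, k)

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind (assumed standard recurrence)."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)

def forward(f, n):
    """g(n) = sum_{k=1}^{n} S2(n, k) f(k)."""
    return sum(stirling2(n, k) * f(k) for k in range(1, n + 1))

def inverse(g, n):
    """f(n) = sum_{k=1}^{n} (-1)^(n+k) S1(n, k) g(k)."""
    return sum((-1) ** (n + k) * stirling1(n, k) * g(k) for k in range(1, n + 1))
```

Applying `inverse` to the function produced by `forward` should return the original values, whatever test function f is chosen.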

Let $t_n$ be the number of transitive and reflexive relations on an
arbitrary n-set X. It can be shown that $t_n$ is the number of topologies
on an n-set. Let $p_n$ be the number of partial orders on X. The reader
may wish to verify that $p_1 = 1$, $p_2 = 3$, $p_3 = 19$ and $t_1 = 1$, $t_2 = 4$, $t_3 =
29$. Although there are no known formulas for $t_n$ and $p_n$, the two
functions are related by our inversion formula.

Suppose X = {1,...,n} and R is a transitive and reflexive relation on
X. We define a new relation R’ on X as follows: (a, b) ∈ R’ if and only
if (a, b) ∈ R and (b, a) ∈ R. We claim that R’ is an equivalence relation
on X. Certainly R’ is reflexive, as (a, a) ∈ R implies (a, a) ∈ R’. If (a, b)
∈ R’, then (b, a) ∈ R’ (by definition), so R’ is symmetric. If (a, b) ∈ R’
and (b, c) ∈ R’, then (a, b), (b, c), (b, a), (c, b) ∈ R, which implies that
(a, c), (c, a) ∈ R (because R is transitive); hence, (a, c) ∈ R’, so R’ is
transitive.

Therefore, R’ is an equivalence relation with equivalence classes [a]
= {b ∈ X : (a, b) ∈ R and (b, a) ∈ R}. This means that in order to
construct a transitive, reflexive relation on X, we must first partition X
into equivalence classes. As X may be partitioned into k equivalence
classes in ${n \brace k}$ ways, the question is, how are the
equivalence classes pieced together? Suppose [a] and [b] are two
different equivalence classes under R’ and (a, b) ∈ R. By transitivity,
(a, c) ∈ R for all c ∈ [b]. Again, by transitivity, (d, c) ∈ R for all d ∈ [a].
To paraphrase: “Everything in [a] is arrowed to everything in [b].”
Therefore we can think of the equivalence classes as points that are
either joined or not joined by an arrow. This defines a partial order on
the set of equivalence classes. In other words, the k equivalence
classes are partially ordered. Because this may be done in $p_k$ ways,
summing over all possible values of k, we obtain

\[ t_n = \sum_{k=1}^{n} {n \brace k} p_k. \tag{2.20} \]
We test this equation by putting in the values
$p_1 = 1$, $p_2 = 3$, $p_3 = 19$, ${3 \brace 1} = 1$, ${3 \brace 2} = 3$, ${3 \brace 3} = 1$, and calculating
$t_3 = 29$.
Our equation would provide a formula for $t_n$ if only a formula for $p_n$
were known (as we already have a formula for ${n \brace k}$). However, using
the inversion formula for Stirling numbers we can write $p_n$ in terms of
$t_n$:

\[ p_n = \sum_{k=1}^{n} (-1)^{n+k} {n \brack k} t_k. \tag{2.21} \]

For example, putting in the values
$t_1 = 1$, $t_2 = 4$, $t_3 = 29$, $(-1)^{3+1} {3 \brack 1} = 2$, $(-1)^{3+2} {3 \brack 2} = -3$, $(-1)^{3+3} {3 \brack 3} = 1$,
we obtain $p_3 = 19$.
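The small values quoted above, and the relation between the two sequences, can be reproduced by brute force. The following sketch enumerates every reflexive relation on an n-set, so it is only feasible for n up to about 4; the function names are hypothetical, and the Stirling numbers of the second kind are computed from their usual recurrence.

```python
from itertools import product

def stirling2(n, k):
    """Stirling number of the second kind (assumed standard recurrence)."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)

def count_relations(n):
    """Return (t_n, p_n): reflexive transitive relations and partial orders on an n-set."""
    pairs = [(a, b) for a in range(n) for b in range(n) if a != b]
    t = p = 0
    for bits in product((False, True), repeat=len(pairs)):
        rel = {(a, a) for a in range(n)}                      # force reflexivity
        rel.update(pr for pr, bit in zip(pairs, bits) if bit)
        transitive = all((a, d) in rel
                         for (a, b) in rel for (c, d) in rel if b == c)
        if transitive:
            t += 1
            # a partial order is additionally antisymmetric
            p += all((b, a) not in rel for (a, b) in rel if a != b)
    return t, p
```

With these counts in hand, the sum of `stirling2(n, k)` times the number of partial orders on k points can be compared with the number of reflexive transitive relations on n points.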

Although no explicit formulas are known for the general terms of the
sequences $\{p_n\}$ and $\{t_n\}$, it is known that $p_n \sim t_n$ and $\log_2 p_n = n^2/4 +
o(n^2)$. We let $\bar{p}_n$ and $\bar{t}_n$ be the unlabeled set versions of $p_n$ and $t_n$.
There are no known formulas for these sequences, although it is
known that $\bar{p}_n \sim p_n/n!$. One might think that $\bar{t}_n = \sum_{k=1}^{n} p(n, k)\,\bar{p}_k$,
but this is false. Why?


Open problem. Find explicit formulas for $p_n$, $\bar{p}_n$, $t_n$, and $\bar{t}_n$.


Table 2.1 presents the first few terms of the four sequences. See [24] 
or the Online Encyclopedia of Integer Sequences. 


Table 2.1 Some terms of four important sequences.


n                        1   2    3     4      5        6         7           8
p_n                      1   3   19   219   4231   130023   6129859   431723379
p_n (unlabeled version)  1   2    5    16     63      318      2045       16999
t_n                      1   4   29   355   6942   209527   9535241   642779354
t_n (unlabeled version)  1   3    9    33    139      718      4535       35979


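Now we give the exponential generating function for the Bell
numbers B(n).


Generating function for the Bell numbers. $\sum_{n=0}^{\infty} B(n)x^n/n! = e^{e^x - 1}$.


Proof. We have

\[
\begin{aligned}
\sum_{n=0}^{\infty} B(n) \frac{x^n}{n!} &= \sum_{n=0}^{\infty} \frac{x^n}{n!}\, e^{-1} \sum_{j=0}^{\infty} \frac{j^n}{j!} \\
&= e^{-1} \sum_{j=0}^{\infty} \frac{e^{jx}}{j!} \\
&= e^{e^x - 1}.
\end{aligned}
\]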
What an interesting-looking generating function! The reader may
wish to compute the first four terms of the generating function and
compare them to the known values B(0) = 1, B(1) = 1, B(2) = 2, B(3) =
5.
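The suggested comparison can be carried out mechanically. The sketch below expands exp(e^x - 1) as a power series with exact fractions, using the standard identity that if g = exp(h) then n g_n = sum_{k=1}^{n} k h_k g_{n-k}; this method and the function name are illustrative choices, not taken from the text.

```python
from fractions import Fraction
from math import factorial

def bell_from_egf(n_max):
    """B(0..n_max) read off from the series of exp(e^x - 1)."""
    # h(x) = e^x - 1 has coefficients h_0 = 0 and h_k = 1/k! for k >= 1
    h = [Fraction(0)] + [Fraction(1, factorial(k)) for k in range(1, n_max + 1)]
    g = [Fraction(1)]                      # g(0) = exp(h(0)) = 1
    for n in range(1, n_max + 1):
        g.append(sum(k * h[k] * g[n - k] for k in range(1, n + 1)) / n)
    # B(n) is n! times the coefficient of x^n
    return [int(g[n] * factorial(n)) for n in range(n_max + 1)]
```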


EXAMPLE 2.7 Rook walks 


A chess Rook can move any number of squares horizontally or
vertically on a chess board. How many different walks can a Rook
travel in moving from the lower-left corner (a1) to the upper-right
corner (h8) on the board? Assume that the Rook moves right or up
at every step. For example, Figure 2.1 shows the Rook walk
a1-c1-d1-d3-d5-f5-f7-h7-h8.


Figure 2.1 A Rook walk from a1 to h8.

Note that if the Rook could only move one square at a time, this
problem would be equivalent to the problem of counting paths
along city streets (see Example 1.14). Here the number of such
paths is simply $\binom{14}{7} = 3432$. We expect that the number of Rook
walks will be much larger.

Solution: We generalize the problem by considering Rook walks to
any square on the board (with the Rook starting on a1 and moving
toward the goal square at every step). We make a table displaying the
number of walks. The bottom-left entry is the number of Rook walks
from a1 to a1, which is 1. Each other entry is the sum of all the entries
below or to the left of the given entry. The reason is that the Rook’s
last move must come from one of the squares represented by these
entries. For example, the entry corresponding to the c4 square is 4 +
12 + 2 + 5 + 14 = 37. It’s possible to complete the table by hand in a
few minutes. The number of Rook walks from a1 to h8 is 470,010.
(The dots in the table indicate that we can generalize the problem to
arbitrarily large chess boards.)


 ⋮
64   320   1328   4864   16428   52356   159645   470010   ···
32   144    560   1944    6266   19149    56190   159645   ···
16    64    232    760    2329    6802    19149    52356   ···
 8    28     94    289     838    2329     6266    16428   ···
 4    12     37    106     289     760     1944     4864   ···
 2     5     14     37      94     232      560     1328   ···
 1     2      5     12      28      64      144      320   ···
 1     1      2      4       8      16       32       64   ···
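The table can be regenerated directly from the rule stated in the solution: each entry is the sum of all entries below it and to its left. The sketch below is illustrative; the function name and the zero-based indexing (square a1 is (0, 0), h8 is (7, 7)) are assumptions of this example.

```python
def rook_walk_table(size):
    """a[m][n] = number of Rook walks from (0, 0) to (m, n), moving right or up."""
    a = [[0] * size for _ in range(size)]
    a[0][0] = 1
    for m in range(size):
        for n in range(size):
            if (m, n) != (0, 0):
                # the last move arrives from the same row (i < m) or same column (j < n)
                a[m][n] = (sum(a[i][n] for i in range(m))
                           + sum(a[m][j] for j in range(n)))
    return a
```

With `table = rook_walk_table(8)`, the entry `table[2][3]` corresponds to the c4 square and `table[7][7]` to h8.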
Let a(m, n) be the number of Rook walks from (0, 0) to (m, n), and
set a(m, n) = 0 if m < 0 or n < 0. The generating function for the
doubly infinite sequence {a(m, n)} is the rational function

\[ \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} a(m, n)\,s^m t^n = \frac{1 - s - t + st}{1 - 2s - 2t + 3st}. \tag{2.22} \]

Can you explain why?

It follows that the sequence a(m, n) satisfies a recurrence relation
(with initial values):

\[ a(m, n) = 2a(m, n-1) + 2a(m-1, n) - 3a(m-1, n-1), \quad m \ge 2 \text{ or } n \ge 2; \]

\[ a(0, 0) = 1, \quad a(0, 1) = 1, \quad a(1, 0) = 1, \quad a(1, 1) = 2. \]

Can you explain why this recurrence relation holds by counting Rook
walks?

Let $a_n = a(n, n)$, for $n \ge 0$. The sequence $\{a_n\}$, called the diagonal
sequence, is 1, 2, 14, 106, 838, 6802, 56190, 470010, ... .
The eighth term, 470,010, is the number of Rook walks in the original
problem.

Let f(x) be the generating function for the diagonal sequence, i.e.,

\[ f(x) = 1 + 2x + 14x^2 + 106x^3 + 838x^4 + 6802x^5 + 56190x^6 + 470010x^7 + \cdots. \]

We will show that

\[ f(x) = \frac{1}{2}\left(1 + \sqrt{\frac{1-x}{1-9x}}\right). \tag{2.23} \]

In order to get the generating function for the diagonal sequence, we
make the change of variables t = x/s. To include terms such as $s^3t^5$, we
allow arbitrary integer exponents for s. For instance, we represent $s^3t^5$
as $s^{-2}x^5$. The diagonal generating function is the coefficient of $s^0$. For
more about this method, see [25].

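We obtain the generating function

\[ \frac{1 - s - x/s + x}{1 - 2s - 2x/s + 3x} = \frac{1}{2}\left(1 + \frac{(1-x)s}{-2s^2 + (3x+1)s - 2x}\right). \]

We now consider the function

\[ \frac{s}{-2s^2 + (3x+1)s - 2x}. \]

Using the quadratic formula, we write this as

\[ \frac{s}{-2(s - \alpha)(s - \beta)}, \]

where

\[ \alpha = \frac{3x + 1 - \sqrt{(1-x)(1-9x)}}{4}, \qquad \beta = \frac{3x + 1 + \sqrt{(1-x)(1-9x)}}{4}. \]

We put our function into partial fraction form,

\[ \frac{1}{2(\beta - \alpha)}\left[\frac{\alpha}{s - \alpha} - \frac{\beta}{s - \beta}\right], \]

or

\[ \frac{1}{2(\beta - \alpha)}\left[\frac{\alpha/s}{1 - (\alpha/s)} + \frac{1}{1 - (s/\beta)}\right]. \]

For -1/9 < x < 1/9, we expand the function as a Laurent series in the
annulus $|\alpha| < |s| < |\beta|$ in powers of $\alpha/s$ and $s/\beta$, obtaining

\[ \frac{1}{2(\beta - \alpha)}\left[\sum_{n=1}^{\infty}\left(\frac{\alpha}{s}\right)^n + \sum_{n=0}^{\infty}\left(\frac{s}{\beta}\right)^n\right]. \]

The coefficient of $s^0$ in this series is

\[ \frac{1}{2(\beta - \alpha)} = \frac{1}{\sqrt{(1-x)(1-9x)}}. \]

This establishes the generating function for the diagonal sequence.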

We can use the generating function for the diagonal sequence to
obtain a recurrence relation for it. By inspection,

\[ f'(x)(1 - x)(1 - 9x) = 4f(x) - 2. \]

We can read off a recurrence relation for $\{a_n\}$ by looking at the
coefficient of $x^{n-1}$ in the above relation:

\[ a_0 = 1, \quad a_1 = 2, \quad a_n = \frac{(10n - 6)a_{n-1} - (9n - 18)a_{n-2}}{n}, \quad n \ge 2. \tag{2.24} \]
A counting argument for this relation is provided by E. Y. Jin and M. 


E. Nebel in “A combinatorial proof of the recurrence for rook paths,” 
in Electronic Journal of Combinatorics, 19, no. 1 (2012). 
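Both recurrences for the Rook walk numbers can be checked against each other. The sketch below builds a(m, n) from the two-variable recurrence with its four initial values, builds the diagonal from recurrence (2.24), and lets the two be compared; the function names and the dictionary representation are illustrative assumptions.

```python
def rook_table_from_recurrence(size):
    """a(m, n) for 0 <= m, n < size via a = 2a(m,n-1) + 2a(m-1,n) - 3a(m-1,n-1)."""
    a = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 2}

    def get(m, n):
        return a.get((m, n), 0)   # a(m, n) = 0 when m < 0 or n < 0

    for m in range(size):
        for n in range(size):
            if m >= 2 or n >= 2:
                a[(m, n)] = 2 * get(m, n - 1) + 2 * get(m - 1, n) - 3 * get(m - 1, n - 1)
    return a

def diagonal_from_recurrence(count):
    """First `count` terms of the diagonal sequence from recurrence (2.24)."""
    d = [1, 2]
    for n in range(2, count):
        d.append(((10 * n - 6) * d[n - 1] - (9 * n - 18) * d[n - 2]) // n)
    return d[:count]
```

The integer division in `diagonal_from_recurrence` is exact for the terms checked here, since each numerator is a multiple of n.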

We can also use the generating function to determine the asymptotic
growth rate of $a_n$. We have

\[ a_n \sim \frac{\sqrt{2}\,9^n}{3\sqrt{\pi n}}. \tag{2.25} \]

See Exercises.

Can you find the generating function for the number of King walks 
from the lower-left corner to the upper-right corner of an arbitrary-size 
chess board? At each step, the King moves one square to the right, one 
square up, or both. Such walks are called Delannoy walks. 


EXERCISES 
2.20 Prove that 


EIQ-(20l-saG) 

nur k k-1 n+1\n 

2.21 Prove that the Catalan number $C_n$ is odd if and only if $n = 2^k - 1$
for some positive integer k.


2.22 Prove that for any positive integer n we have

\[ C_{3n-1} \equiv C_{3n} \equiv C_{3n+1} \pmod 3. \]
2.23 Prove that if n > 3, then the Catalan number $C_n$ is not a
prime.
2.24 Define the n × n matrix $A = [a_{ij}]$ by the rule $a_{ij} = C_{i+j-2}$.
Prove that det A = 1.
2.25 (Ising problem, 2 x n case) Let O(a, b) be a box consisting of 
a+ b cells, each 1 x 1, arranged in a row of length a sitting on top 
of a row of length b (the leftmost cells of the two rows line up). 
Let f(a, b) be the number of ways of covering O(a, b) with 1 x 1 
and 1 x 2 tiles. Set up a recurrence relation for f (a, b). Generate 
some data using a computer and make a conjecture about a 


formula for f(a, b). 
2.26 Prove that $\sum_{k=1}^{n} (-1)^{n+k} {n \brack k} B(k) = 1$.


2.27 Prove that

\[ (x + y)^{(n)} = \sum_{k=0}^{n} \binom{n}{k} x^{(k)} y^{(n-k)} \]

and

\[ (x + y)_{(n)} = \sum_{k=0}^{n} \binom{n}{k} x_{(k)} y_{(n-k)}. \]


2.28 Let X be the number of cycles of a randomly chosen
permutation of {1,...,n}. We have shown that E(X) ~ ln n. Show
that Var(X) ~ ln n.
Hint: From the relation $x^{(n)} = \sum_{k=1}^{n} {n \brack k} x^k$ obtain

Var(X) = ete?” | : 

nidx dz als 

Use logarithmic differentiation. 
2.29 Let ay, be the number of permutations in S, which alternately 
rise, fall, rise, fall, etc. For example, 142635 is such a 
permutation. Find }>" , anz"/n! and use this information to find 
dg. 
2.30 Let {ay} be a sequence with the exponential generating 


function 
3 cl =exp/| z+ au 

a mt aia ie elisa 
Evaluate the sum 

k 

sik 

Soca Jas. 
i=0 : 
Interpret the sum combinatorially. Using Exercise 1.58, find a 
combinatorial interpretation of ap. See Exercises 1.37, 2.45, and 


2.69. 

Hint: Multiply both sides by e ~. 
2.31 A lone King is on a chessboard. How many paths may the
King take from the lower-left corner (a1) of the board to the
upper-right corner (h8), moving one square right, up, or up-right at every
step?

2.32 Find a recurrence relation for the number of Queen walks 
from the lower-left corner of an arbitrary-size chess board to the 
square (n, n). (A Queen can move any number of squares 
horizontally, vertically, or diagonally. Assume that the Queen 
always moves to the right, up, or up-right.) Find the corresponding 
generating function. 

2.33 Suppose that a ChildRook can move like a chess Rook but
only at most two squares horizontally or vertically. Let a(m, n) be
the number of walks a ChildRook can take from square (1, 1) to
square (m, n) on an arbitrary-size chess board. (Assume that the
ChildRook always moves toward the goal square.) Find a
finite-order recurrence relation for a(m, n).

2.34 Suppose that a RookPlus can move like a chess Rook with 
the additional possibility of moving one square diagonally. Let 
b(m, n) be the number of walks a RookPlus can take from square 
(1, 1) to square (m, n) on an arbitrary-size chess board. (Assume 
that the RookPlus always moves toward the goal square.) Find a 
finite-order recurrence relation for b(m, n). 

2.35 Show that

\[ a_n \sim \frac{\sqrt{2}\,9^n}{3\sqrt{\pi n}}, \]

where $a_n$ is the number of Rook walks from (0, 0) to (n, n).

Hint: We know that $a_n$, for $n \ge 1$, is the coefficient of $x^n$ in the
generating function $\frac{1}{2}\sqrt{1-x}/\sqrt{1-9x}$. We examine this
generating function, which has a singularity at x = 1/9. Let y =
9x and $h(y) = \frac{1}{2}\sqrt{1 - y/9} = h_0 + h_1y + h_2y^2 + \cdots$. Then
$h_0 + h_1 + h_2 + \cdots = h(1) = \sqrt{2}/3$. Suppose that $(1 - y)^{-1/2} =
s_0 + s_1y + s_2y^2 + \cdots$. Then

\[ a_n = (h_0s_n + h_1s_{n-1} + \cdots + h_ns_0)\,9^n \sim h(1)\,s_n 9^n, \]

and, using Stirling’s approximation $n! \sim n^n e^{-n}\sqrt{2\pi n}$, we have

\[ s_n = \binom{-1/2}{n}(-1)^n \sim \frac{1}{\sqrt{\pi n}}. \]


2.3 Partition numbers 


In Chapter 1, we defined the partition number p(n, k) to be the
number of ways to write n as a sum of k positive integers. We can
also think of p(n, k) as the number of onto functions f: X → Y, |X|
= n, |Y| = k, where X and Y are unlabeled sets. Also, the partition
number p(n) has been defined as $p(n) = \sum_{k=1}^{n} p(n, k)$. In this
section we develop generating functions for partition numbers.


EXAMPLE 2.8 
Determine p(4, k), for k = 1, 2, 3, 4, and p(4).
Solution: We have
p(4, 1) = 1   (4 = 4)
p(4, 2) = 2   (2 + 2 = 4, 3 + 1 = 4)
p(4, 3) = 1   (2 + 1 + 1 = 4)
p(4, 4) = 1   (1 + 1 + 1 + 1 = 4)


and

\[ p(4) = p(4, 1) + p(4, 2) + p(4, 3) + p(4, 4) = 1 + 2 + 1 + 1 = 5. \]

Suppose that X and Y are unlabeled and f: X → Y is an onto
function (|X| = n, |Y| = k). Suppose that the inverse images of
the elements of Y have cardinalities $\lambda_1, \ldots, \lambda_k$. Because the
inverse images account for all the elements of X, it follows
that

\[ \lambda_1 + \lambda_2 + \cdots + \lambda_k = n. \]

Furthermore, we may assume that the $\lambda_i$ are ordered from
largest to smallest:

\[ \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_k > 0. \]

A partition $\lambda_1 + \cdots + \lambda_k = n$ may be pictured with a Ferrers
diagram consisting of k rows of dots with $\lambda_i$ dots in row i,
where $1 \le i \le k$. The Ferrers diagram for the partition 7 + 3 +
1 + 1 = 12 is shown in Figure 2.2.


Figure 2.2 The Ferrers diagram of a partition of 12.

The transpose of a Ferrers diagram is created by writing
each row of dots as a column. The resulting partition is called
the conjugate of the original partition. For example, the
partition 12 = 7 + 3 + 1 + 1 of Figure 2.2 is transposed to
create the conjugate partition 12 = 4 + 2 + 2 + 1 + 1 + 1 + 1 of
Figure 2.3.

Figure 2.3 A transpose Ferrers diagram.

The reader may enjoy matching each partition of 4 on the
previous page with its conjugate. (One partition is
self-conjugate.) We now give the ordinary generating function for
the partition numbers p(n). For convenience, we set p(0) = 1.


Generating function for integer partitions.

\[ \sum_{n=0}^{\infty} p(n)x^n = \prod_{k=1}^{\infty} \frac{1}{1 - x^k}. \]


Proof. We need to show that the coefficients of $x^n$ on the two
sides of the equation are equal. The coefficient of $x^n$ on the
left side is patently p(n). On the right side, the product may be
written as

\[ \prod_{k=1}^{\infty} \frac{1}{1 - x^k} = \prod_{k=1}^{\infty} (1 + x^k + x^{2k} + x^{3k} + x^{4k} + \cdots). \]

To find the contribution to $x^n$ from this product, suppose the
term $x^{m(k)k}$ is selected from the kth factor and these terms are
multiplied to yield $x^{m(1) + m(2)2 + \cdots}$. If this expression equals
$x^n$, then

\[ m(1) + m(2)2 + \cdots = n. \tag{2.26} \]

Contributions to $x^n$ correspond to solutions of (2.26). These
solutions may be envisioned as Ferrers diagrams for partitions
of n. With t as the greatest integer for which m(t) is nonzero,
we create the Ferrers diagram with m(t) rows of t dots,
followed by m(t - 1) rows of t - 1 dots, etc. This
correspondence between solutions of (2.26) and partitions of n
completes the proof.
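The product formula translates directly into a procedure for computing p(n): multiply the factors 1/(1 - x^k) one at a time, truncating at the desired degree. The sketch below is an illustration with a hypothetical function name.

```python
def partition_numbers(n_max):
    """Coefficients of prod_{k>=1} 1/(1 - x^k) up to x^{n_max}, i.e. p(0..n_max)."""
    coeffs = [1] + [0] * n_max
    for k in range(1, n_max + 1):
        # multiplying by 1/(1 - x^k) = 1 + x^k + x^{2k} + ... is a running sum with step k
        for n in range(k, n_max + 1):
            coeffs[n] += coeffs[n - k]
    return coeffs
```

Factors with k greater than n_max cannot affect the truncated series, which is why the outer loop stops there.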

Determining the ordinary generating function for p(n, k) 
with k fixed is not difficult. We start by making an elementary 
observation from the transposes of Ferrers diagrams. 


Theorem. The number of partitions of n into exactly k parts is equal to the 
number of partitions of n where the size of the greatest summand is k. 


Now we can establish the generating function for p(n, k). 


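Theorem.

\[ \sum_{n=k}^{\infty} p(n, k)x^n = x^k \prod_{j=1}^{k} \frac{1}{1 - x^j}. \]


Proof. We define p(n, ≤ k) to be the number of partitions of n
into at most k parts. From the previous theorem, p(n, ≤ k) is
also the number of partitions of n into parts of size at most k.
Clearly, p(n, k) = p(n, ≤ k) - p(n, ≤ k - 1). We obtain

\[ \sum_{n=0}^{\infty} p(n, \le k)x^n = \prod_{j=1}^{k} (1 + x^j + x^{2j} + x^{3j} + \cdots) = \prod_{j=1}^{k} \frac{1}{1 - x^j}. \]

It follows (the terms with n < k vanish in the difference) that

\[
\begin{aligned}
\sum_{n=k}^{\infty} p(n, k)x^n &= \sum_{n=k}^{\infty} [p(n, \le k) - p(n, \le k-1)]x^n \\
&= \prod_{j=1}^{k} \frac{1}{1 - x^j} - \prod_{j=1}^{k-1} \frac{1}{1 - x^j} \\
&= \prod_{j=1}^{k} \frac{1}{1 - x^j}\,[1 - (1 - x^k)] \\
&= x^k \prod_{j=1}^{k} \frac{1}{1 - x^j}.
\end{aligned}
\]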

Let p(n, O) be the number of partitions of n into summands 
each of which is an odd number. Let p(n, D) be the number of 
partitions of n into distinct summands. As a further illustration 
of the use of generating functions, we prove that p(n, O) = 


p(n, D). 
Euler’s theorem (1748). 
p(n, O) = p(n, D). 


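Proof. We have

\[
\begin{aligned}
\sum_{n=0}^{\infty} p(n, O)x^n &= \frac{1}{(1 - x)(1 - x^3)(1 - x^5)\cdots} \\
&= \frac{(1 - x^2)(1 - x^4)(1 - x^6)\cdots}{(1 - x)(1 - x^2)(1 - x^3)(1 - x^4)(1 - x^5)(1 - x^6)\cdots} \\
&= \frac{(1 - x^2)}{(1 - x)} \cdot \frac{(1 - x^4)}{(1 - x^2)} \cdot \frac{(1 - x^6)}{(1 - x^3)} \cdots \\
&= (1 + x)(1 + x^2)(1 + x^3)\cdots \\
&= \sum_{n=0}^{\infty} p(n, D)x^n.
\end{aligned}
\]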

The desired equality follows by comparing coefficients of the 
two generating functions. 
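Euler's theorem can be checked numerically by expanding the two products separately. The sketch below is illustrative and uses hypothetical function names; the two loop directions correspond to factors of the form 1/(1 - x^j) and (1 + x^j), respectively.

```python
def count_partitions(n, allowed, distinct=False):
    """Count partitions of n with parts from `allowed`; each part used at most once if distinct."""
    ways = [1] + [0] * n
    for part in allowed:
        if distinct:
            # descending order: multiplying by (1 + x^part), each part used at most once
            for total in range(n, part - 1, -1):
                ways[total] += ways[total - part]
        else:
            # ascending order: multiplying by 1/(1 - x^part), unlimited repetition
            for total in range(part, n + 1):
                ways[total] += ways[total - part]
    return ways[n]

def odd_parts(n):
    """p(n, O): partitions of n into odd summands."""
    return count_partitions(n, range(1, n + 1, 2))

def distinct_parts(n):
    """p(n, D): partitions of n into distinct summands."""
    return count_partitions(n, range(1, n + 1), distinct=True)
```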


EXERCISES 

2.36 Show that the number of partitions of n into summands 
none of which occurs exactly once is the same as the number 
of partitions of n into summands none of which is congruent 


to 1 or 5 modulo 6. 
2.37 Find formulas for p(n, 1), p(n, 2), and p(n, 3). Find an 
asymptotic estimate for p(n, k) (with k fixed). 


2.38 Prove Jacobi’s identity: 


2° Pi a 1 
1 ee eee: 7 (Pee. er 
+2) Goce ep lia 


2.39 We say that two permutations $\sigma$ and $\tau$ of $S_n$ are in the
same conjugacy class if there exists a permutation $\rho \in S_n$ such
that $\tau = \rho\sigma\rho^{-1}$. Prove that two permutations are in the same
conjugacy class if and only if they have the same cycle
structure. How many conjugacy classes of $S_n$ are there?


2.40 
(a) How many nonisomorphic abelian groups of order 
2700 are there? 
(b) How many ways may one make $2.23 postage using 1 
cent, 2 cent, 3 cent, 10 cent, 20 cent, $1, and $2 stamps, 
and not more than three of any one denomination? 
Notice that the answers to parts (a) and (b) are the same. 
Why is this? 


2.4 Labeled and unlabeled sets 


Many enumeration problems can be solved by representing the
objects to be counted as functions f : X → Y, where X and Y are
chosen appropriately. The conditions imposed in the enumeration
problem usually amount to putting restrictions on the functions
(e.g., requiring them to be one-to-one) and/or making rules as to
when two functions are considered equivalent (e.g., when they are
equivalent up to a permutation of the elements of X). For example,
the binomial coefficient $\binom{n}{m}$ is the number of 1-1 functions from
an m-set X into an n-set Y, where the elements of X are considered
to be indistinct.


Suppose that X and Y are two finite nonempty sets and $Y^X$ is the
collection of functions f : X → Y. We wish to define some
equivalence relations on $Y^X$ and, in each case, count the number of
equivalence classes. When we say that two functions f and g (f, g ∈
$Y^X$) are equivalent, we will mean one of four things:

(1) f = g.

(2) f = gh for some bijection h : X → X.

(3) f = ig for some bijection i : Y → Y.

(4) f = igh for some bijections h : X → X and i : Y → Y.
(Note that the functions here are applied from right to left.)

In definitions (2) and (4) we say that h delabels X, and we speak of X 

as an unlabeled (or delabeled) set. Likewise, in definitions (3) and (4) 


we say that i delabels Y, and we speak of Y as an unlabeled or 
delabeled set. 


EXAMPLE 2.9 
Consider the functions f and g of Figure 2.4. Are they equivalent?


Figure 2.4 Two functions. Are they equivalent?

Solution: The functions fail to be equivalent under definition (1)
because, after all, they are different functions. However, f = gh if h: X
→ X is the bijection given by h(a) = a, h(b) = c, and h(c) = b; therefore
f and g satisfy definition (2). The bijection h rearranges the set {a, b,
c}, eliminating the discrepancy between the two functions due to the
labeling of the elements of the domain. If i is the bijection given by
i(x) = x, i(y) = z, and i(z) = y, then f = ig; therefore f and g satisfy
definition (3). It is now clear that the functions f and g of Figure 2.4
are equivalent according to definitions (2), (3), and (4).


EXAMPLE 2.10 
Are the functions f and g of Figure 2.5 equivalent?


Figure 2.5 Equivalent functions when domain and codomain are
unlabeled.

Solution: The functions f and g are equivalent only when both X and Y
are unlabeled, i.e., according to definition (4). Define h: X → X by
h(a) = c, h(b) = b, and h(c) = a, and define i: Y → Y by i(x) = z, i(y) =
x, and i(z) = y. Then f = igh.

The domain and codomain of a function may be labeled or unlabeled 
sets, leading to four types of functions to be counted. Furthermore, 
functions may be classified according to whether they are one-to-one, 
onto, both, or not necessarily either. Altogether there are 16 cases, as 
shown in Table 2.2. 


Table 2.2 The number of functions from X to Y, where |X| = m and |Y| = n.


                           Y labeled               Y unlabeled

X labeled     all          $n^m$                   $B(m)$  (m ≤ n)
              1-1          $P(n, m)$               1 if m ≤ n, 0 if m > n
              onto         $T(m, n)$               ${m \brace n}$
              bijections   $m!\,\delta(m, n)$      $\delta(m, n)$

X unlabeled   all          $\binom{m+n-1}{m}$      $p(m)$  (m ≤ n)
              1-1          $\binom{n}{m}$          1 if m ≤ n, 0 if m > n
              onto         $\binom{m-1}{n-1}$      $p(m, n)$
              bijections   $\delta(m, n)$          $\delta(m, n)$


We assume that f: X → Y is a function from a set X with m elements
to a set Y with n elements. Let us consider the entries in the table,
beginning with the X labeled, Y labeled box.


X labeled, Y labeled 


We have already noted that the total number of such functions is $n^m$.
As for the second entry, every one-to-one (1-1) function
corresponds to an ordered selection of m objects from the set Y. There
are n choices for the first object selected, n - 1 choices for the second,
and so on, leading to the formula

\[ P(n, m) = n(n-1)\cdots(n-m+1) = \frac{n!}{(n-m)!}, \quad m \le n. \]

If m > n, there are no one-to-one functions so we set P(n, m) = 0.

In the case of bijections, we either have two sets of the same
cardinality or we don’t. If m ≠ n, then no bijection is possible.
However, if m = n, then there are m! ways to match up the two sets.
We define δ(m, n) to be 0 if m ≠ n and 1 if m = n. Thus, the number of
bijections is m!δ(m, n). If either X or Y (or both) is unlabeled, then all
bijections look the same, so the total number is δ(m, n).


X unlabeled, Y labeled 


A one-to-one function f : X → Y is equivalent to a selection of m
elements of Y without regard to order. The number of such selections
is the value of the binomial coefficient $\binom{n}{m} = \frac{n!}{m!\,(n-m)!}$.

A function f: X → Y is equivalent to a distribution of m identical
elements (the elements of X) into n different boxes (the elements of Y).
A box may receive any number of objects, including zero. Suppose we
represent the m identical objects with m copies of the symbol *. The n
boxes are represented by a set of n - 1 vertical lines |. The placement of
the objects in the boxes is indicated by a linear ordering of the *’s and
|’s. For example, ***|*||* means that the first box (to the left of the first |)
contains three objects, the second box contains one object, the third
contains no objects, and the fourth contains one object. The total
number of items in the linear ordering is m + n - 1, and n - 1 of these are
|’s. Therefore, the total number of functions is $\binom{m+n-1}{n-1}$.
If f is onto, then each box must contain at least one object, and there
are m - n objects to distribute freely. Hence, the number of onto
functions is

\[ \binom{(m-n)+n-1}{n-1} = \binom{m-1}{n-1}. \]


Can you furnish a more direct combinatorial proof of this formula? 
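The two counts for the X unlabeled, Y labeled case can be verified by enumeration. In the sketch below, a function from an unlabeled m-set to a labeled n-set is modeled as a multiset of m images, which is an assumption of this illustration; the function name is hypothetical.

```python
from itertools import combinations_with_replacement
from math import comb

def count_unlabeled_domain(m, n, onto=False):
    """Count functions from an unlabeled m-set to a labeled n-set, as multisets of images."""
    total = 0
    for images in combinations_with_replacement(range(n), m):
        # an onto function must use every one of the n boxes at least once
        if not onto or len(set(images)) == n:
            total += 1
    return total
```

The totals can then be compared with `comb(m + n - 1, n - 1)` and, for onto functions, with `comb(m - 1, n - 1)`.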


X labeled, Y unlabeled 


If X is labeled and Y unlabeled, then the total number of functions
from X to Y is the number of ways X may be partitioned into unlabeled
parts (corresponding to the images under f). We prefer to define a
formula only when m ≤ n, in which case the magnitude of n becomes
unimportant (X can’t be divided into more than m parts). The Bell
number B(m) is the number of such functions.

If m > n, then one-to-one functions do not exist. However, if m ≤ n,
then all one-to-one functions look alike when Y is unlabeled. This
observation accounts for the 1-1 entries of the X labeled, Y unlabeled
and X unlabeled, Y unlabeled boxes.

The number of onto functions when X is labeled and Y is unlabeled
is ${m \brace n}$, a Stirling number of the second kind. Clearly,

\[ {m \brace n} = \frac{T(m, n)}{n!}, \]

as dividing by n! permutes the labels of the n sets into which X has
been divided.


X unlabeled, Y unlabeled 
The total number of functions when X and Y are unlabeled and m < n is 
denoted p(m), and the values of p(m) are the partition numbers. The 
number of onto functions is the partition number p(m, n). 

Four relations are obtained by comparing the total number of 
functions in each box of Table 2.2 to the number of onto functions. 


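Thus
\[ n^m = \sum_{j=1}^{m} \binom{n}{j} T(m,j), \tag{2.27} \]
\[ \binom{m+n-1}{n-1} = \sum_{j=1}^{m} \binom{n}{j}\binom{m-1}{j-1}, \tag{2.28} \]
\[ B(m) = \sum_{j=1}^{m} \left\{ {m \atop j} \right\}, \tag{2.29} \]
\[ p(m) = \sum_{j=1}^{m} p(m,j). \tag{2.30} \]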

The relation (2.27) follows from the fact that every function
$f: X \to Y$ is onto some subset $Y' \subseteq Y$ of cardinality $j$,
where $1 \le j \le m$. The binomial coefficient $\binom{n}{j}$ “chooses”
$Y'$ and $T(m, j)$ counts the number of functions from X onto $Y'$.
Relations (2.28) through (2.30) are proved similarly.
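The four relations can be checked numerically for small m and n. The sketch below is only an illustration: the helper names (`onto`, `stirling2`, `partitions_into`) are ours, T(m, n) is computed by inclusion and exclusion, and the values B(5) = 52 and p(6) = 11 are used as independent reference points.

```python
from math import comb, factorial

def onto(m, n):
    """T(m, n): onto functions from a labeled m-set to a labeled n-set,
    computed by inclusion-exclusion."""
    return sum((-1) ** k * comb(n, k) * (n - k) ** m for k in range(n + 1))

def stirling2(m, n):
    """Stirling number of the second kind, T(m, n) / n!."""
    return onto(m, n) // factorial(n)

def partitions_into(m, n):
    """p(m, n): partitions of m into exactly n positive parts."""
    if m == 0 and n == 0:
        return 1
    if n <= 0 or m < n:
        return 0
    # Either some part equals 1 (remove it) or all parts exceed 1
    # (subtract 1 from each of the n parts).
    return partitions_into(m - 1, n - 1) + partitions_into(m - n, n)

def check_relations(limit=6):
    """Check (2.27) and (2.28) for 1 <= m, n <= limit."""
    for m in range(1, limit + 1):
        for n in range(1, limit + 1):
            if n ** m != sum(comb(n, j) * onto(m, j) for j in range(1, m + 1)):
                return False
            if comb(m + n - 1, n - 1) != sum(
                comb(n, j) * comb(m - 1, j - 1) for j in range(1, m + 1)
            ):
                return False
    return True
```

Relations (2.29) and (2.30) then amount to summing `stirling2(m, j)` and `partitions_into(m, j)` over j.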


EXERCISE 


2.41 Verify the formulas for the X unlabeled, Y labeled case in 
Table 2.2. 


2.42 Let |X| = m, |Y| =n. 
(a) How many possible relations are there from X to Y? 
(b) How many relations are there if X is unlabeled and Y is
labeled?

(c) How many relations are there if X is labeled and Y is
unlabeled?

(d) How many relations are there if both X and Y are
unlabeled?


Hint: In each case, the relation R from X to Y may be viewed as a
function $f : X \to P(Y)$, defined by $f(x) = R(x)$. Now use the
techniques associated with the fundamental counting problem
for functions. For example, the answer for (b) is
$\binom{m+2^n-1}{m}$.


2.43 Let |X| = m. An algebra on X is a subset S of P(X) with the
following properties:

1. $X \in S$.

2. $A \in S$ implies $X - A \in S$.

3. $A \in S$ and $B \in S$ imply $A \cup B \in S$.

(a) How many algebras on X are possible if X is labeled?

(b) How many if X is unlabeled?
2.44 Let f be a random function from {1,...,n} to {1,...,n} and let
r(n) be the expected number of elements in the range of f. Find
$\lim_{n \to \infty} r(n)/n$.
2.45 How many onto functions $f: X \to X$, |X| = m (X labeled),
have the property that $f(f(x)) = x$ for all $x \in X$? Find a recurrence
relation or an explicit formula. See Exercises 2.30 and 2.69.
2.46 Give a combinatorial proof of the formula for the number of 
onto functions from an unlabeled set to a labeled set. 


2.5 Counting with symmetry 


We have classified functions $f: X \to Y$ (|X| = m, |Y| = n), where X and
Y are labeled or unlabeled sets, and we enumerated these functions.
Now we generalize the notion of labeled and unlabeled sets. For
instance, recall that two functions $f: X \to Y$ and $g: X \to Y$ are
equivalent in the X unlabeled, Y labeled sense if there exists a bijection
$h: X \to X$ such that $f = gh$. The bijection h can be viewed as a
permutation of X (in fact, any permutation in the symmetric group
$S_m$). What happens if we restrict the permutations to a specified
subgroup G of $S_m$? If G is the identity group $\{e\}$, for example, then
we obtain the X labeled case; while if $G = S_m$, we obtain the X
unlabeled case. Nontrivial subgroups G give rise to interesting
intermediate cases. In these cases, Pólya’s theorem for the number of
inequivalent functions allows us to count quite complicated
configurations, including nonisomorphic graphs. (See Section 3.3 for
definitions about graphs.) More generally, if one group G acts on X and
another group H acts on Y, then the number of inequivalent functions is
given by de Bruijn’s formula, which enumerates more intricate structures
such as self-complementary graphs.
A group G is a nonempty set on which is defined a binary operation
* satisfying the following three laws:

1. For all $x, y, z \in G$, $x * (y * z) = (x * y) * z$ (associativity).

2. There exists an element $e \in G$ with the property that, for all
$x \in G$, $x * e = e * x = x$.

3. For every $x \in G$, there exists an element $x^{-1} \in G$ with the
property that $x * x^{-1} = x^{-1} * x = e$.


The element e is called the identity of G. The element $x^{-1}$ is called
the inverse of x. One can easily prove that the identity element of a
group is unique and that the inverse $x^{-1}$ of each element x is unique.
We sometimes indicate a group by an ordered pair, e.g., (G, *), to
emphasize both the set and the operation. We usually suppress the
group operation sign and write xy for x * y. We abbreviate $xx$ by $x^2$,
$x^{-1}x^{-1}$ by $x^{-2}$, etc. For any $x \in G$, we set $x^0 = e$.


EXAMPLE 2.11 


The set Z is a group with respect to addition. 
A finite group is a group with a finite number of elements, and the 
order of a finite group is the number of elements in it. 
The cyclic group $Z_n$, of order n, is the set {0,...,n − 1} under the
clock addition operation * defined by $x * y = x + y$ if $x + y < n$ and
$x * y = x + y - n$ if $x + y \ge n$.
A group G is abelian if xy = yx for all x, y € G. Otherwise, G is 
nonabelian. 


EXAMPLE 2.12 
The group $Z_n$ is abelian.
The order of an element $x \in G$ is the least positive integer n for
which $x^n = e$. If there is no such integer, then we say that x has
infinite order.


EXAMPLE 2.13 

In $Z_4$, the elements 0, 1, 2, 3 have orders 1, 4, 2, 4, respectively.
The symmetric group $S_n$ consists of the n! permutations of an n-set
(e.g., $N_n$). The group operation * is composition of permutations. The
elements of $S_n$ are conveniently written in cycle notation. Thus,
(1 2 3)(4 8)(3 6 7)(5)(9)
is the element of $S_9$ which maps 1 to 2 to 3 to 1, transposes 4 and 8,
maps 3 to 6 to 7 to 3, and fixes 5 and 9. To multiply two permutations
together, we just compute the result of the composition of the two
bijections (reading left to right). For example,

(1 2 3)(4 5) * (1 2 3 4 5) = (1 3 2 4)(5).

Since (1 2)(1 3) ≠ (1 3)(1 2), the symmetric group $S_n$ is nonabelian
for all n ≥ 3.
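The left-to-right multiplication convention is easy to get wrong, so a small sketch may help. The function names below are illustrative, not from the text; permutations are stored as dictionaries on {1,...,n}, and `multiply(p, q)` applies p first and then q, matching the convention above.

```python
def perm_from_cycles(cycles, n):
    """Build the permutation of {1..n} given as a product of disjoint cycles."""
    p = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            p[a] = b
    return p

def multiply(p, q):
    """Left-to-right product: apply p first, then q."""
    return {i: q[p[i]] for i in p}

def to_cycles(p):
    """Write a permutation as a list of disjoint cycles (fixed points included)."""
    seen, out = set(), []
    for start in sorted(p):
        if start in seen:
            continue
        cyc, i = [], start
        while i not in seen:
            seen.add(i)
            cyc.append(i)
            i = p[i]
        out.append(tuple(cyc))
    return out

a = perm_from_cycles([(1, 2, 3), (4, 5)], 5)
b = perm_from_cycles([(1, 2, 3, 4, 5)], 5)
product_cycles = to_cycles(multiply(a, b))  # [(1, 3, 2, 4), (5,)]
```

The same helpers show that (1 2)(1 3) and (1 3)(1 2) differ, which is the nonabelian claim for n ≥ 3.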

A homomorphism from one group $(G_1, *_1)$ to another group
$(G_2, *_2)$ is a map $f : G_1 \to G_2$ which preserves multiplicative
structure: $f(x *_1 y) = f(x) *_2 f(y)$ for all $x, y \in G_1$. If the
homomorphism is a bijection we call it an isomorphism and say that the two
groups $G_1$ and $G_2$ are isomorphic; we write $G_1 \cong G_2$. A
one-to-one homomorphism is called a monomorphism and an onto homomorphism
is called an epimorphism. An isomorphism from a group G to itself is
called an automorphism of G.

Suppose that $G_1$ and $G_2$ are two groups. The product of $G_1$ and
$G_2$, denoted $G_1 \times G_2$, is the set of ordered pairs
$\{(g_1, g_2) : g_1 \in G_1, g_2 \in G_2\}$ subject to the multiplication
rule $(g_1, g_2) * (g_1', g_2') = (g_1 g_1', g_2 g_2')$.


EXAMPLE 2.14 
The product $Z_2 \times Z_2$ is a four-element group. It is not
isomorphic to $Z_4$, for $Z_2 \times Z_2$ has three elements of order 2
and $Z_4$ has only one.
We say that H is a subgroup of G if H is a subset of G and H is a
group with respect to the group operation of G. We say that H is a
normal subgroup of G if it is invariant under conjugation by elements
of G, that is, $xHx^{-1} = H$ for all $x \in G$.


EXAMPLE 2.15 


The two-element group {(1 2)(3), (1)(2)(3)} is a subgroup of the 
six-element group S3. It is not a normal subgroup. Why not? 


The symmetric group $S_n$ is especially important because every finite
group is isomorphic to a subgroup of some $S_n$. This is Cayley’s
theorem.


Cayley’s theorem (1854). If G is a finite group of order n, then G is isomorphic to a
subgroup of $S_n$.
Proof. For each element $g \in G$, we define a function $f_g: G \to G$ by
the rule $f_g(a) = ag$ (right multiplication by g). Note that $f_g$ is a
bijection because it has an inverse, namely, $f_{g^{-1}}$. We check:
$(f_g \circ f_{g^{-1}})(a) = ag^{-1}g = a$ and
$(f_{g^{-1}} \circ f_g)(a) = agg^{-1} = a$. Since each $f_g$ is a
permutation of the n-set G, we can define a map $\phi: G \to S_n$ by
$\phi(g) = f_g$. We claim that $\phi$ is a monomorphism. First we check
that $\phi$ is a homomorphism:
$\phi(gh)(a) = f_{gh}(a) = a(gh) = (ag)h = f_h(f_g(a)) = (f_g f_h)(a)
= (\phi(g)\phi(h))(a)$, where the product $f_g f_h$ is read left to
right. Now we check that $\phi$ is one-to-one: If
$\phi(g)(a) = \phi(h)(a)$, then $f_g(a) = f_h(a)$, which implies that
$ag = ah$, and $g = h$.

The dihedral group $D_n$, of order 2n, consists of the set of
symmetries of a regular n-gon. If we number the vertices of the n-gon
1,...,n, then we see that $D_n$ is a subgroup of $S_n$. Specifically, the
subgroup is generated by two permutations: the rotation
$r = (1\ 2\ 3\ 4\ \ldots\ n)$ and a flip f along an axis of symmetry of
the n-gon. If n is odd we take the flip to be
$f = (n)(1\ n-1)(2\ n-2)(3\ n-3) \ldots (\tfrac{n-1}{2}\ \tfrac{n+1}{2})$.
If n is even we choose
$f = (1\ n-1)(2\ n-2) \ldots (\tfrac{n}{2}-1\ \tfrac{n}{2}+1)$. Each
element of $D_n$ can be written in the form $r^a f^b$, where
$a \in \{0, 1, \ldots, n-1\}$ and $b \in \{0, 1\}$. Two such elements are
multiplied using the basic rules $r^n = e$, $f^2 = e$, and
$rf = fr^{-1}$. From these basic rules it is possible to compute all
other products. We say that $D_n$ has the presentation
$\langle r, f : r^n = 1, f^2 = 1, rf = fr^{-1} \rangle$. For information
about group presentations the reader should consult [17].
We have noted that every element of $S_n$ can be expressed as a
product of cycles. A cycle of length 2 is called a transposition and a
cycle of length 1 is called a fixed point. Cycles of length greater than 2
can be written as products of transpositions. For example,
(1 2 3) = (1 2)(1 3). A permutation may be written as a product of
transpositions and fixed points in more than one way. However, the
number of transpositions is always even or always odd. A permutation is
accordingly called an even permutation or an odd permutation. It
follows from the first homomorphism theorem for groups, to be
discussed shortly, that $S_n$ contains n!/2 even permutations and n!/2
odd permutations. (This fact follows more simply from the observation
that $f(\sigma) = (1\ 2)\sigma$ is a bijection between the set of even
permutations of $S_n$ and the set of odd permutations of $S_n$.) As the
identity permutation is even and the even permutations are closed under
multiplication and taking inverses, the even permutations form a group.

The alternating group $A_n$ is the group of even permutations of the
set {1,...,n}. The group has order $\frac{1}{2}n!$.

Let G be a group (with identity element e) and X a nonempty set. An
action of G on X is a function $\theta: G \times X \to X$ which satisfies
the following two conditions:

1. For every $x \in X$, we have $\theta(e, x) = x$.

2. For every $g, h \in G$ and $x \in X$, we have
$\theta(g, \theta(h, x)) = \theta(gh, x)$.

For convenience, we write $\theta(g, x)$ as $gx$, so that the two
conditions become:


1. ex =x. 


2. g(hx) = (gh)x. 


Remember, however, that g and h are group elements and x is a set 
element. 

We note that each element $g \in G$ yields a permutation of the set X
(defined by sending x to gx). For if $gx = gy$, then $x = y$ (hence the
map is one-to-one) and $g(g^{-1}x) = x$ (hence the map is onto).


EXAMPLE 2.16 
The symmetric group $S_n$ acts on $N_n$ by the natural action
$gx = g(x)$, where $g(x)$ is the image of x under the function
$g: N_n \to N_n$.


EXAMPLE 2.17
The cyclic group $Z_n$ acts on $N_n$ by
\[ gx = \begin{cases} g + x & \text{if } g + x \le n \\ g + x - n & \text{if } g + x > n. \end{cases} \]
Here g denotes the equivalence class representative of [g] that lies
between 1 and n.

If $x \in X$, then the orbit of x (under the action $\theta$) is
$\mathrm{orb}(x) = \{gx : g \in G\}$. Orbits constitute equivalence
classes of X; that is, X is partitioned into orbits. If there is only one
orbit, then the action $\theta$ is transitive. Both examples above are
transitive actions.


To find the size of orb(x), note that $gx = hx$ if and only if
$h^{-1}gx = x$. Let $G_x = \{g \in G : gx = x\}$. We call $G_x$ the
stabilizer of x. We leave it to the reader to check that $G_x$ is a
subgroup of G. Now $gx = hx$ if and only if $g \in hG_x$, that is, if g
and h are elements of the same coset of $G_x$. Therefore, the number of
distinct values of gx is the number of cosets of $G_x$:
\[ |\mathrm{orb}(x)| = [G : G_x] = \frac{|G|}{|G_x|}. \tag{2.31} \]
Let G be a finite group. For each $g \in G$, define $f_g: G \to G$ by
$f_g(x) = gxg^{-1}$. We say that G acts on itself by conjugation. The
stabilizer of an element $x \in G$ is called the centralizer of x and is
denoted C(x). We call orb(x) the conjugacy class of x, and denote it by
ccl(x). For the conjugacy action, we have
\[ |\mathrm{ccl}(x)| = [G : C(x)] = \frac{|G|}{|C(x)|}. \tag{2.32} \]
In the symmetric group $S_n$, two permutations are in the same conjugacy
class if and only if they have the same cycle structure. Therefore, the
number of conjugacy classes of $S_n$ equals the partition number p(n).


Burnside’s lemma. Let G be a finite group acting on a finite nonempty set X. For
each $g \in G$, let $f_g$ be the number of elements of X fixed by g. Then the number
of orbits N is given by
\[ N = \frac{1}{|G|} \sum_{g \in G} f_g. \]


Proof. The proof is a nice example of the technique of enumerating a
set two different ways and comparing the results. The set is
\[ S = \{(g, x) : g \in G,\ x \in X,\ \text{and } gx = x\}. \]

On the one hand, by definition of $f_g$, we have
$|S| = \sum_{g \in G} f_g$. On the other hand, counting from the
perspective of the elements of X, letting $x'$ denote orbit
representatives, and using (2.31), we have
\[ |S| = \sum_{x \in X} |G_x| = \sum_{x'} |\mathrm{orb}(x')|\,|G_{x'}| = \sum_{x'} |G| = |G| N. \]
Equating the two expressions for |S|, we obtain
$\sum_{g \in G} f_g = |G| N$, from which the desired relation follows
instantly.
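As a sanity check of the lemma, the orbits of a small action can be counted both directly and as an average of fixed-point counts. The sketch below assumes a setting not spelled out in the text: the cyclic group of rotations acting on binary strings of length n. The function names are illustrative.

```python
from itertools import product

def rotate(s, k):
    """Rotate the tuple s by k positions."""
    return s[k:] + s[:k]

def orbits_direct(n):
    """Count orbits by collecting one canonical representative per orbit."""
    strings = product((0, 1), repeat=n)
    return len({min(rotate(s, k) for k in range(n)) for s in strings})

def orbits_burnside(n):
    """Count orbits as the average number of strings fixed by a rotation."""
    strings = list(product((0, 1), repeat=n))
    fixed_total = sum(
        sum(1 for s in strings if rotate(s, k) == s) for k in range(n)
    )
    return fixed_total // n  # Burnside's lemma guarantees divisibility
```

For n = 6 both counts are 14, in agreement with (64 + 2 + 4 + 8 + 4 + 2)/6.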


EXAMPLE 2.18 Average number of fixed points of a 
permutation 


How many fixed points has the average permutation of {1, 2, 3,
...,n}? Recall that we solved this problem using probability in
Example 1.27.
Solution: Applying Burnside’s lemma to the natural action of $S_n$ on
$N_n$ (which is transitive), we obtain
\[ \frac{1}{n!} \sum_{g \in S_n} f_g = 1. \]
This is the average number of fixed points.
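The conclusion can be confirmed by brute force for small n. This is only an illustrative sketch (the function name is ours); it enumerates all n! permutations, so it is practical only for small n.

```python
from fractions import Fraction
from itertools import permutations

def average_fixed_points(n):
    """Exact average number of fixed points over all permutations of n items."""
    perms = list(permutations(range(n)))
    fixed = sum(sum(1 for i, image in enumerate(p) if i == image) for p in perms)
    return Fraction(fixed, len(perms))
```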

We now assume that $f: X \to Y$ is a function from a set X of m elements
to a set Y of n elements. In the terminology of Pólya’s theory of
counting, the elements of Y are labels, and f is a labeling of X.
Suppose that a finite group G acts on X. We picture this situation with
the following diagram:

       G
       ↓
    f: X → Y

This action of G on X induces an action of G on the set of labelings
$Y^X$ as follows: $(gf)(x) = f(g^{-1}x)$ for all $x \in X$. We check the
two axioms for an action:


1. $(ef)(x) = f(e^{-1}x) = f(x)$ for all $x \in X$.


2. $(g(hf))(x) = (hf)(g^{-1}x) = f(h^{-1}g^{-1}x) = f((gh)^{-1}x) = ((gh)f)(x)$
for all $x \in X$.


The set $Y^X$ of functions is partitioned into equivalence classes by this
action. Functions in different equivalence classes are called
G-inequivalent functions. By definition, the number of G-inequivalent
functions is the number of orbits of the action of G on $Y^X$.


Number of orbits theorem. When G (finite) acts on the set of functions $Y^X$ (|X| = m, |Y| = n),
the number of orbits is
\[ \frac{1}{|G|} \sum_{g \in G} n^{c(1) + c(2) + \cdots + c(m)}, \tag{2.33} \]
where c(i) is the number of cycles of length i of g (regarded as a permutation of X).


Proof. Suppose that g, when regarded as a permutation of X, has cycle
structure c = (c(1), c(2),...,c(m)). The functions fixed by g are
precisely those which are constant on each cycle. As there are
$c(1) + \cdots + c(m)$ cycles, each of which may be assigned one of n
images, the number of functions fixed by g is
$f_g = n^{c(1) + \cdots + c(m)}$. Our conclusion now follows directly
from Burnside’s lemma.


EXAMPLE 2.19 Stacks of chips 


How many stacks of 11 poker chips are possible with two colors 

of poker chips (red and white)? 
Solution: Let X be the set of 11 positions in the stack and Y = {red,
white}. The group $S_2 = \{e, \tau\}$ acts on X, the identity e leaving
the stack alone and $\tau$ turning the stack upside down. We need to know
the cycle structures of e and $\tau$. Certainly, e consists of 11 fixed
points, so c(1) = 11 and c(i) = 0 for $2 \le i \le 11$. And $\tau$
consists of one fixed point (the middle poker chip) and five
transpositions, so c(1) = 1, c(2) = 5, and c(i) = 0 for
$3 \le i \le 11$. Therefore, the number of $S_2$-inequivalent stacks is
\[ \frac{1}{2}\left(2^{11} + 2^{6}\right) = 1056. \]
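The count of 1056 can be confirmed by listing every stack and identifying each one with its upside-down copy. A minimal sketch, with a function name of our choosing:

```python
from itertools import product

def count_stacks(height, colors):
    """Count stacks of chips up to turning the stack upside down."""
    seen = set()
    for stack in product(range(colors), repeat=height):
        # A stack and its reversal are the same; keep the smaller tuple.
        seen.add(min(stack, stack[::-1]))
    return len(seen)
```

The same function agrees with the closed form $(n^{m} + n^{\lceil m/2 \rceil})/2$ for other small heights m and numbers of colors n.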


EXAMPLE 2.20 Circular necklaces 


How many 10-bead circular necklaces may be made using 
black or white beads? 
Solution: Symmetries of the circular arrangement of beads are of six
types: identity, rotations with one cycle, rotations with two cycles,
rotations with five cycles, reflections through two opposite beads, and
reflections through two opposite “sides.” The number of necklaces is
\[ \frac{1}{20}\left(2^{10} + 4 \cdot 2 + 4 \cdot 2^{2} + 2^{5} + 5 \cdot 2^{6} + 5 \cdot 2^{5}\right) = 78. \]
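A direct enumeration gives the same count. The sketch below (the function name is ours) identifies each coloring with all of its rotations and reflections and keeps one canonical representative per class.

```python
from itertools import product

def count_necklaces(n, colors):
    """Count colorings of n beads in a circle up to rotation and reflection."""
    def canonical(s):
        images = []
        for t in (s, s[::-1]):  # the arrangement and its mirror image
            images.extend(t[k:] + t[:k] for k in range(n))  # all rotations
        return min(images)

    return len({canonical(s) for s in product(range(colors), repeat=n)})
```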


While it is possible at this point to enumerate some rather 
complicated structures (nonisomorphic graphs, for instance), we 
prefer to do so only after developing some additional machinery— 
cycle indexes—in the next section. 


EXERCISE 
2.47 (a) How many conjugacy classes has $S_5$?
(b) Let x = (1 2 3)(4)(5). Find |ccl(x)| and C(x).
2.48 Let $Z_4$ rotate a cube around an axis passing through the centers of opposite faces.
Verify Burnside’s lemma for $X_1$ = the set of vertices of the cube,
$X_2$ = the set of faces of the cube, and
$X_3$ = the set of edges of the cube.

2.49 (a) What is the average number of 1-cycles in the group $A_n$?
(b) What about in $D_n$?


2.6 Cycle indexes 


As in the previous section, we let c = (c(1),...,c(m)) be the cycle
structure of $g \in G$ when G acts on X (G finite, |X| = m). We assign
to g the monomial
\[ x_1^{c(1)} x_2^{c(2)} \cdots x_m^{c(m)}, \]
where the $x_i$ are variables in a commutative ring containing the
rational numbers. The cycle index Z(G) is the average of these
monomials:
\[ Z(G) = \frac{1}{|G|} \sum_{g \in G} x_1^{c(1)} x_2^{c(2)} \cdots x_m^{c(m)}. \tag{2.34} \]

The cycle index Z(G) contains complete information about the cycle
lengths of the various permutations g of the group action. George
Pólya (1887-1985) chose the letter Z to stand for the German word
Zyklenzeiger (“cycle indicator”). As Pólya said, “The cycle index
knows many things.” For instance, letting each $x_i = n$ (a substitution
denoted $Z(G)[x_i \leftarrow n]$), we obtain the formula for the number
of G-inequivalent functions $f: X \to Y$ (|X| = m, |Y| = n):
\[ Z(G)[x_i \leftarrow n] = \frac{1}{|G|} \sum_{g \in G} n^{c(1) + \cdots + c(m)}. \tag{2.35} \]


We next determine the cycle indexes of the most important group
actions: $E_n$, $Z_n$, $D_n$, $S_n$, $A_n$. The set acted upon is always
X = {1,...,n}.

The identity group $E_n$. The identity group consists of only the
identity element e, which fixes every element of X. There are n
1-cycles, and the cycle index is
\[ Z(E_n) = x_1^n. \tag{2.36} \]

The cyclic group $Z_n$. Given $g \in Z_n$, $x \in X$, the length of the
cycle containing x is the minimum positive k for which
$gk + x \equiv x \pmod{n}$, or $gk \equiv 0 \pmod{n}$. Because k is
independent of x, all cycles of g have length k. If g contains j cycles,
then jk = n, from which it follows that k is a divisor of n and j = n/k.
To determine the number of elements g corresponding to each value of k,
observe that $k'n/k$ has order k whenever $\gcd(k', k) = 1$. There are
$\varphi(k)$ such values of $k'$, by definition of Euler’s
$\varphi$-function. As $\sum_{k \mid n} \varphi(k) = n$, there are
exactly $\varphi(k)$ values of g for each $k \mid n$. Therefore
\[ Z(Z_n) = \frac{1}{n} \sum_{k \mid n} \varphi(k)\, x_k^{n/k}. \tag{2.37} \]

The dihedral group $D_n$. As $D_n$ contains $Z_n$ as a subgroup, the cycle
index of $D_n$ will contain all the terms in the formula (2.37). The other
elements of $D_n$ are “flips” (reflections). If n is odd, each flip fixes
one element of X and contains (n − 1)/2 transpositions. If n is even,
half of the flips contain n/2 transpositions and half contain two fixed
points and (n − 2)/2 transpositions. Putting these facts together, we
obtain the formulas
\[ Z(D_n) = \frac{1}{2} Z(Z_n) + \begin{cases} \frac{1}{2}\, x_1 x_2^{(n-1)/2} & (n \text{ odd}) \\[4pt] \frac{1}{4}\left(x_2^{n/2} + x_1^2 x_2^{(n-2)/2}\right) & (n \text{ even}). \end{cases} \tag{2.38} \]
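Formulas (2.37) and (2.38) are straightforward to evaluate under the substitution $x_i \leftarrow$ (number of colors). The following sketch uses our own function names and a naive Euler $\varphi$; exact arithmetic with `Fraction` avoids rounding.

```python
from fractions import Fraction
from math import gcd

def phi(k):
    """Euler's phi-function, computed naively."""
    return sum(1 for j in range(1, k + 1) if gcd(j, k) == 1)

def cyclic_count(n, colors):
    """Z(Z_n)[x_i <- colors], following (2.37)."""
    total = sum(phi(k) * colors ** (n // k) for k in range(1, n + 1) if n % k == 0)
    return Fraction(total, n)

def dihedral_count(n, colors):
    """Z(D_n)[x_i <- colors], following (2.38)."""
    if n % 2 == 1:
        # each flip: one fixed point and (n - 1)/2 transpositions
        flips = Fraction(1, 2) * colors ** (1 + (n - 1) // 2)
    else:
        # half the flips: n/2 transpositions; half: two fixed points
        flips = Fraction(1, 4) * (colors ** (n // 2) + colors ** (2 + (n - 2) // 2))
    return cyclic_count(n, colors) / 2 + flips
```

With two colors, `dihedral_count(10, 2)` reproduces the 78 necklaces of Example 2.20.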


The symmetric group $S_n$. A permutation $g \in S_n$ can have any cycle
structure c = (c(1),...,c(n)), where
\[ 1c(1) + 2c(2) + \cdots + nc(n) = n. \tag{2.39} \]
The number of permutations with a given cycle structure c is a good
counting puzzle in its own right. Such permutations may be generated by
ordering the elements of X and partitioning the elements from left to
right into cycles of the appropriate lengths. Each of the n! orderings
of X gives rise to repeated permutations due to interchanging cycles and
writing cycles down in more than one way. Because there are c(k)! ways
to list the cycles of length k and each such cycle may be written k
ways, the number of permutations with cycle structure c is
\[ h(c) = \frac{n!}{\prod_{k=1}^{n} k^{c(k)}\, c(k)!}. \tag{2.40} \]
The cycle index of the symmetric group is
\[ Z(S_n) = \frac{1}{n!} \sum_{c} h(c)\, x_1^{c(1)} \cdots x_n^{c(n)}. \tag{2.41} \]


The alternating group $A_n$. The alternating group consists of the
$\frac{1}{2}n!$ even permutations of $S_n$. When a permutation is written
as a disjoint product of cycles, it is easy to tell whether it is even or
odd. Because each cycle of odd length is equal to the product of an even
number of transpositions, the number of cycles of odd length has no
bearing on whether a permutation is even. However, each cycle of even
length is equal to the product of an odd number of transpositions. Hence,
for a permutation to be even, it must be composed of an even number of
disjoint cycles of even length. Therefore,
\[ \frac{1}{2}\left(1 + (-1)^{c(2) + c(4) + \cdots + c(2\lfloor n/2 \rfloor)}\right) \]
counts 1 for every even permutation and 0 for every odd permutation
in $S_n$. This establishes the cycle index for the alternating group:
\[ Z(A_n) = \frac{1}{n!} \sum_{c} h(c) \left(1 + (-1)^{c(2) + c(4) + \cdots}\right) x_1^{c(1)} \cdots x_n^{c(n)}. \tag{2.42} \]


EXAMPLE 2.21 Circular necklaces (again) 


How many 11-bead circular necklaces can be made with two 
types of beads? 
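Solution: The appropriate group is $D_{11}$ acting on the set X = {1,
...,11}. Hence
\[ Z(D_{11}) = \frac{1}{2} Z(Z_{11}) + \frac{1}{2} x_1 x_2^5 = \frac{1}{22} \sum_{k \mid 11} \varphi(k)\, x_k^{11/k} + \frac{1}{2} x_1 x_2^5. \]
The only divisors of 11 are 1 and 11, for which $\varphi(1) = 1$ and
$\varphi(11) = 10$. Thus
\[ Z(D_{11}) = \frac{1}{22}\left(x_1^{11} + 10 x_{11}\right) + \frac{1}{2} x_1 x_2^5. \]
The number of necklaces that can be made with two types of beads is
obtained by making the substitution
\[ Z(D_{11})[x_i \leftarrow 2] = \frac{1}{22}\left(2^{11} + 10 \cdot 2\right) + \frac{1}{2} \cdot 2 \cdot 2^5 = 126. \]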

Recall that in Example 2.19 we determined that there are 1056 
different stacks of 11 poker chips using two types of chips. The 
number of necklaces with 11 beads of two types is smaller because a 
necklace has more symmetries than a stack of chips. 




EXERCISE 
2.50 Calculate Z(Z3), Z(D3), Z(A3), and Z(S3). 


2.51 How many 10-bead circular necklaces may be made using three types of beads? 
2.52 Calculate the number of 100-bead circular necklaces that may be made using two 
types of beads. 

2.53 How many 100-bead circular necklaces can be made with 50 white beads and 50 
black beads? 


Hint: If you don’t see how to solve this now, consult the next section. 


2.54 How many ways may the eight vertices of a cube be colored with three colors (up 
to symmetry of the cube)? How many ways may the six faces be colored? How many 
ways may the 12 edges be colored? 


2.7 Polya’s theorem 


Pólya’s theorem. Let G act on X and therefore on the set of functions $f: X \to Y$ (G finite, |X| = m,
|Y| = n). Suppose that $F(y(1),\ldots,y(n))$ is the set of G-inequivalent functions for which
$|f^{-1}(y_i)| = y(i)$, where $1 \le i \le n$. Then
\[ Z(G)\left[x_i \leftarrow y_1^i + \cdots + y_n^i\right] = \sum |F(y(1),\ldots,y(n))|\; y_1^{y(1)} \cdots y_n^{y(n)}, \tag{2.43} \]
where the right-hand sum is taken over all n-tuples $(y(1),\ldots,y(n))$ with
$y(1) + \cdots + y(n) = m$.


Proof. We need to show that the coefficient of
$y_1^{y(1)} \cdots y_n^{y(n)}$ on the left side of the relation (2.43)
is $|F(y(1),\ldots,y(n))|$. By Burnside’s lemma,
\[ |F(y(1),\ldots,y(n))| = \frac{1}{|G|} \sum_{g \in G} f_g, \tag{2.44} \]
where $f_g$ is the number of functions in $F(y(1),\ldots,y(n))$ fixed
by g. Suppose that g has cycle structure c = (c(1),...,c(m)). If the
function $f \in F(y(1),\ldots,y(n))$ is fixed by g, then f is constant on
each cycle of g; that is, each cycle of g lies entirely within one of the
inverse images $f^{-1}(y_j)$. Picture an m × 1 box (the elements of X)
partitioned into n sections of sizes $y(1),\ldots,y(n)$ (the inverse
images of f). Then $f_g$ is the number of possible packings of this box
with c(i) blocks of size i, where $1 \le i \le m$ (the cycles of g). The
polynomial on the left side of the relation (2.43) is
\[ \frac{1}{|G|} \sum_{g \in G} x_1^{c(1)} \cdots x_m^{c(m)} \left[x_i \leftarrow \sum_{j=1}^{n} y_j^i\right] = \frac{1}{|G|} \sum_{g \in G} \prod_{k=1}^{m} \left(\sum_{j=1}^{n} y_j^k\right)^{c(k)}. \]
Let us see how the $y_1^{y(1)} \cdots y_n^{y(n)}$ terms are formed in the
relation above. Suppose that each term
$\left(\sum_{j=1}^{n} y_j^i\right)^{c(i)}$ is expanded as a product of
c(i) factors. Then each term in the product of these multiplicands is
obtained by choosing a $y_j^i$ from each factor. The contribution to
$y_1^{y(1)} \cdots y_n^{y(n)}$ is the number of ways the exponents (c(i)
units of size i) may be arranged to equal y(j) for each
$1 \le j \le n$. These arrangements are clearly equivalent to the box
packings described above. Hence the coefficient of
$y_1^{y(1)} \cdots y_n^{y(n)}$ is
\[ \frac{1}{|G|} \sum_{g \in G} f_g = |F(y(1),\ldots,y(n))|, \]
as we wanted to show.

EXAMPLE 2.22 Circular necklaces (yet again) 


Recall Example 2.20, which counted the number of circular 
necklaces with 10 beads, using black or white beads. How 
many such necklaces are made of seven white beads and 
three black beads? 


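Solution: The cycle index is
\[ Z(D_{10}) = \frac{1}{20}\left(x_1^{10} + 4x_{10} + 4x_5^2 + x_2^5 + 5x_1^2 x_2^4 + 5x_2^5\right). \]
Making the substitution $x_i \leftarrow y_1^i + y_2^i$ yields
\[ y_1^{10} + y_1^9 y_2 + 5y_1^8 y_2^2 + 8y_1^7 y_2^3 + 16y_1^6 y_2^4 + 16y_1^5 y_2^5 + 16y_1^4 y_2^6 + 8y_1^3 y_2^7 + 5y_1^2 y_2^8 + y_1 y_2^9 + y_2^{10}. \]
We see that there are eight necklaces with seven white beads and
three black beads. Can you sketch these eight necklaces?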
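One way to check a single coefficient of the enumerating polynomial, without expanding it, is to list the placements of the black beads directly. This brute-force sketch (function name ours) counts circular necklaces of n beads with exactly k black beads, up to rotation and reflection.

```python
from itertools import combinations

def necklaces_with_black(n, k):
    """Count n-bead circular necklaces with exactly k black beads,
    up to rotation and reflection."""
    seen = set()
    for blacks in combinations(range(n), k):
        s = tuple(1 if i in blacks else 0 for i in range(n))
        # all rotations of the arrangement and of its mirror image
        images = [t[j:] + t[:j] for t in (s, s[::-1]) for j in range(n)]
        seen.add(min(images))
    return len(seen)
```

Summing `necklaces_with_black(10, k)` over k from 0 to 10 recovers the total of 78 from Example 2.20.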


EXAMPLE 2.23 Circular necklaces (one more time) 


Making the substitution $x_i \leftarrow a^i + 1$ into $Z(D_{11})$, we
obtain the polynomial
\[ a^{11} + a^{10} + 5a^9 + 10a^8 + 20a^7 + 26a^6 + 26a^5 + 20a^4 + 10a^3 + 5a^2 + a + 1. \]
(We use the term $a^i + 1$ instead of $y_1^i + y_2^i$ because it is
simpler.) This polynomial tells us the number of 11-bead
circular necklaces made of two types of beads, according to
the number of beads of each type. For instance, there are 20
necklaces with seven white beads and four black beads.

EXERCISE 


2.55 How many stacks of 10 poker chips contain four black chips and six white 
chips (up to inversion of the stack)? 


2.56 How many circular necklaces of 10 beads may be made from two types of 
beads with five of each type? 


2.57 How many 11-bead circular necklaces are there with three red beads, three 
white beads, and five blue beads? 


2.58 How many ways may eight identical markers be placed on an 8 x 8 square 
grid (up to a rotation of the grid)? 
2.59 How many ways may the vertices of an octagon be colored with three colors 
(up to symmetry of the octagon)? 


2.60 How many ways may the faces of a cube be colored using three colors (up to 
symmetry of the cube)? 


2.61 (a) How many ways may the 12 faces of a regular dodecahedron be colored 
with two colors? 


(b) How many ways with six faces of each color? 


2.62 How many ways can the faces of a regular icosahedron be colored with three 
colors (up to symmetry of the icosahedron)? 


2.8 The number of graphs 


In order to use Pólya’s theorem to count nonisomorphic graphs of
order n, we need to determine the cycle index of the appropriate
group action. A graph G = (V, E) may be identified with a
function $f : [V]^2 \to \{0, 1\}$, where $f(\{x, y\}) = 1$ or 0 according
to whether or not {x, y} is a member of E. We say that two graphs
$G_1$ and $G_2$ are isomorphic if there is a bijection from $V_1$ to
$V_2$ (their vertex sets) that preserves adjacency. We normally
regard two isomorphic graphs as the same graph. Two graphs are
isomorphic if the corresponding functions are equal up to a
permutation of V. As any permutation of V is allowed, the group
acting on V is $S_n$, where n = |V|. This group induces an action on
$[V]^2$ by the rule $\{x, y\} \mapsto \{gx, gy\}$. We call this action
$[S_n]^2$. Our goal is to calculate the cycle index $Z([S_n]^2)$.

Assume that $g \in S_n$ has cycle structure c = (c(1),...,c(n)). We
determine the cycle structure of g as a permutation of $[V]^2$ by
assuming that $\{x, y\} \in [V]^2$ and examining four cases. We call the
cycles in $[V]^2$ pair-cycles.

Case 1. Suppose that x and y are in cycles of different lengths a
and b. There are c(a) choices for which cycle contains x and c(b)
choices for which cycle contains y. Given these choices, the ab
ordered pairs in the Cartesian product of the two cycles are
partitioned into pair-cycles of length lcm(a, b). The number of
such cycles is ab/lcm(a, b) = gcd(a, b). The contribution to
$Z([S_n]^2)$ is
\[ \prod_{1 \le a < b \le n} x_{\mathrm{lcm}(a,b)}^{c(a)c(b)\gcd(a,b)}. \tag{2.45} \]
Case 2. Suppose that x and y are in different cycles of the same
length a. There are $\binom{c(a)}{2}$ choices for the cycles. The
pair-cycle has length a, and the number of pair-cycles is a. The
contribution to $Z([S_n]^2)$ is
\[ \prod_{1 \le a \le n} x_a^{a\binom{c(a)}{2}}. \tag{2.46} \]
Case 3. Suppose that x and y are in the same cycle of odd length
a. There are c(a) choices for the cycle containing x and y and (a −
1)/2 choices for the gap between x and y. The pair-cycles all have
length a. The contribution to $Z([S_n]^2)$ is
\[ \prod_{a \text{ odd}} x_a^{c(a)(a-1)/2}. \tag{2.47} \]

Case 4. Suppose that x and y are in the same cycle of even length
a. This case is the same as Case 3 except for one important
difference. There are still c(a) choices for the cycle containing x
and y. But now, although the typical pair-cycle has length a, and
there are (a − 2)/2 choices for the gap between x and y, there is
also the possibility that x and y are a/2 units apart in the cycle, so
that the pair-cycle has length a/2. The contribution to $Z([S_n]^2)$ is
\[ \prod_{a \text{ even}} x_a^{c(a)(a-2)/2}\, x_{a/2}^{c(a)}. \tag{2.48} \]
Combining the formulas (2.45), (2.46), (2.47), and (2.48), we
obtain a formula for the cycle index of $[S_n]^2$:
\[ Z([S_n]^2) = \frac{1}{n!} \sum_{c} h(c) \prod_{1 \le a < b \le n} x_{\mathrm{lcm}(a,b)}^{c(a)c(b)\gcd(a,b)} \prod_{1 \le a \le n} x_a^{a\binom{c(a)}{2}} \prod_{a \text{ odd}} x_a^{c(a)(a-1)/2} \prod_{a \text{ even}} x_a^{c(a)(a-2)/2}\, x_{a/2}^{c(a)}. \tag{2.49} \]
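The cycle indexes for $2 \le n \le 4$ are easily calculated by hand:
\[ Z([S_2]^2) = \frac{1}{2}(2x_1), \tag{2.50} \]
\[ Z([S_3]^2) = \frac{1}{6}\left(x_1^3 + 3x_1 x_2 + 2x_3\right), \tag{2.51} \]
\[ Z([S_4]^2) = \frac{1}{24}\left(x_1^6 + 9x_1^2 x_2^2 + 8x_3^2 + 6x_2 x_4\right). \tag{2.52} \]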


Let’s examine the polynomial $Z([S_4]^2)$. The 24 permutations of
the set {1, 2, 3, 4} induce permutations of the six unordered pairs
of elements. The identity permutation results in six fixed pairs.
The six permutations of type 2 + 1 + 1 result in two fixed points
and two 2-cycles in the set of unordered pairs. The three
permutations of type 2 + 2 result in two fixed points and two
2-cycles. The eight permutations of type 3 + 1 result in two
3-cycles. The six 4-cycles result in one 2-cycle and one 4-cycle.


We apply Pólya’s theorem to the cycle index $Z([S_4]^2)$ to
enumerate nonisomorphic graphs of order 4 by number of edges:
\[ Z([S_4]^2)[x_i \leftarrow y_1^i + y_2^i] = y_1^6 + y_1^5 y_2 + 2y_1^4 y_2^2 + 3y_1^3 y_2^3 + 2y_1^2 y_2^4 + y_1 y_2^5 + y_2^6. \tag{2.53} \]
The coefficient of $y_1^k y_2^{6-k}$ in this enumerating polynomial is the
number of nonisomorphic graphs of order 4 with k edges and 6 −
k non-edges. The symmetry in the polynomial comes from the
fact that G and H are isomorphic if and only if their complements
$\overline{G}$ and $\overline{H}$ are isomorphic.

We verify formula (2.53) by presenting in Figure 2.6 all 
nonisomorphic graphs of order 4 arranged by number of edges. 


Figure 2.6 Graphs of order 4 and their enumerating monomials. 
The total number of nonisomorphic graphs of order 4 is
\[ Z([S_4]^2)[x_i \leftarrow 2] = Z([S_4]^2)[y_1 \leftarrow 1, y_2 \leftarrow 1] = 1 + 1 + 2 + 3 + 2 + 1 + 1 = 11. \]


The number of nonisomorphic graphs. The number of nonisomorphic graphs of order n is
\[ Z([S_n]^2)[x_i \leftarrow 2]. \]
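For small n this count can be computed without formula (2.49), by letting every permutation of the vertices act on the unordered pairs and averaging $2^{(\text{number of pair-cycles})}$, as in the number of orbits theorem. The sketch below is a brute-force illustration with names of our choosing; it enumerates all n! permutations, so it is only practical for small orders.

```python
from itertools import combinations, permutations
from math import factorial

def pair_cycle_count(perm):
    """Number of cycles of the permutation induced by perm on unordered pairs."""
    n = len(perm)
    seen, cycles = set(), 0
    for pair in combinations(range(n), 2):
        if pair in seen:
            continue
        cycles += 1
        q = pair
        while q not in seen:
            seen.add(q)
            q = tuple(sorted((perm[q[0]], perm[q[1]])))
    return cycles

def count_graphs(n):
    """Number of nonisomorphic graphs of order n, via Burnside's lemma."""
    total = sum(2 ** pair_cycle_count(g) for g in permutations(range(n)))
    return total // factorial(n)
```

For orders 1 through 6 this gives 1, 2, 4, 11, 34, 156.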


Let g(n) be the number of nonisomorphic graphs on n vertices.
Table 2.3 gives the value of g(n) for small n. We will show that
\[ g(n) \sim \frac{2^{\binom{n}{2}}}{n!}. \]
Because $2^{\binom{n}{2}}$ is the number of labeled graphs, this means
that almost all graphs have no nontrivial automorphisms. In fact,
$g(n) \sim c(n)$, where c(n) is the number of nonisomorphic connected
graphs of order n. See [14].

Table 2.3 The number of nonisomorphic graphs.
order   number of graphs
1       1
2       2
3       4
4       11
5       34
6       156
7       1044
8       12346
9       274668
10      12005168

Open problem. Find a formula for the number of nonisomorphic graphs of order n that contain a 
triangle. 


Open problem. Find a formula for the number of nonisomorphic planar graphs of order n. 


EXERCISE 


2.63 Use formula (2.49) to find the number of graphs of order 5 (up to isomorphism). 
2.64 How many ways are there to color the edges of Kg with three colors (up to 
isomorphism)? 

2.65 How many ways can the edges of the graph $K_{3,3}$ be colored with nine colors (up
to isomorphism of the graph)?

2.66 Write the generating function for nonisomorphic multigraphs in terms of $Z(S_n)$. (A
multigraph is a graph in which pairs of vertices can be joined by more than one edge.)
2.67 How many nonisomorphic multigraphs of order 4 have at most five edges?


2.68 Prove the identity
\[ Z(S_n) = \frac{1}{n} \sum_{k=1}^{n} x_k Z(S_{n-k}). \]

2.69 (a) Let $Z(S_0) = 1$. Prove the identity
\[ \sum_{n=0}^{\infty} Z(S_n) = \exp\left(\sum_{k=1}^{\infty} \frac{x_k}{k}\right). \]
(b) Let $a_n$ be the number of onto functions $f : X \to X$, where |X| = n, with the
property that $f(f(x)) = x$ for all $x \in X$. Show that the exponential generating function
for the sequence $\{a_n\}$ is
\[ \sum_{n=0}^{\infty} a_n \frac{x^n}{n!} = \exp\left(x + \frac{x^2}{2}\right). \]

Note that $a_n$ is the number of involutions of an n-set and also the
number of n × n symmetric permutation matrices. See Exercise 2.45.


2.9 Symmetries in domain and range 


Suppose that X and Y are finite nonempty sets (|X| = m, |Y| = n) acted
upon by finite groups G and H, respectively. We picture this situation
with the following diagram:

       G   H
       ↓   ↓
    f: X → Y

We want to define what it means for two functions to be the same
under these group actions. To this end, we define a new action $H^G$ on
the set of functions $Y^X = \{f: X \to Y\}$ as follows: if $g \in G$,
$h \in H$, and $f \in Y^X$, then $(h^g f)(x) = hf(g^{-1}x)$. (The reader
should check that the two axioms for a group action are satisfied.) We
say that two functions are equivalent if they are in the same orbit of
the action and inequivalent otherwise. This definition is a
generalization of the labeled/unlabeled set paradigm. If G is the
symmetric group $S_m$, then X is unlabeled, and if G is the identity group
$E_m$, then X is labeled. Likewise, if H is $S_n$, then Y is unlabeled,
and if H is $E_n$, then Y is labeled. For any groups G and H, Burnside’s
lemma gives the number of inequivalent functions:
\[ \frac{1}{|G||H|} \sum_{g \in G} \sum_{h \in H} \Omega(g, h), \tag{2.54} \]
where $\Omega(g, h)$ is the number of functions fixed by $h^g$.

If f is fixed by $h^g$, then $f(g^{-1} x) = h^{-1} f(x)$ for all x ∈ X. Thus f(x) = y
implies that $f(g^{-1} x) = h^{-1} f(x) = h^{-1} y$, which in turn implies that
$f(g^{-2} x) = h^{-2} y$. In general, $f(g^{-i} x) = h^{-i} y$. It follows that if x is in a cycle of
length i in g and y is in a cycle of length j in h, then j divides i.
Furthermore, we have shown that the correspondence between the
two cycles is completely determined by the relation f(x) = y. There are
j choices for the image of x. Suppose that g has cycle structure (c(1),
...,c(m)) and h has cycle structure (d(1),...,d(n)). Then, given i and a
particular cycle of length i in g, the number of ways to define a fixed
function on that cycle is

$$m_i(h) = \sum_{j \mid i} j\, d(j), \tag{2.55}$$

and the total number of fixed functions is

$$\Omega(g, h) = \prod_{i} m_i(h)^{c(i)}. \tag{2.56}$$

Formulas (2.54) through (2.56) combine to yield the following 
formula of N. G. de Bruijn (1918-2012). 


De Bruijn’s formula. If finite groups G and H act on finite nonempty sets X and Y,
respectively, then the number of inequivalent functions is given by

$$N = \frac{1}{|H|} \sum_{h \in H} Z(G)[x_i \leftarrow m_i(h)]. \tag{2.57}$$


We apply de Bruijn’s formula to the problem of counting
self-complementary graphs of order n, that is, graphs G for which G is
isomorphic to its complement $\overline{G}$. In the previous section we
determined the cycle index $Z([S_n]^2)$ and the number of
nonisomorphic graphs of order n, i.e., $g(n) = Z([S_n]^2)[x_i \leftarrow 2]$. If $[S_n]^2$
acts on the set X of 2-element subsets of {1,...,n} and $S_2$ acts on $Y = \{0, 1\}$,
then inequivalent functions correspond to nonisomorphic graphs where G and
its complement $\overline{G}$ are regarded as the same. Hence, the number of such
functions is

$$N(n) = \frac{1}{2} \sum_{h \in S_2} Z([S_n]^2)[x_i \leftarrow m_i(h)]. \tag{2.58}$$
Let $g^*(n)$ be the number of self-complementary graphs of order n. As
2N(n) counts nonisomorphic graphs with each self-complementary
graph counted twice, we obtain the formula

$$g^*(n) = 2N(n) - g(n). \tag{2.59}$$

For example, $g^*(4) = 2N(4) - g(4) = 2 \cdot 6 - 11 = 1$, and the unique
self-complementary graph of order 4 is the path $P_4$.
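
The order 4 computation above can be checked numerically. The following is a minimal sketch, not the author's program: it evaluates Burnside's sum over pairs (g, h) using the fixed-function counts of (2.55) and (2.56). The function names (`cycle_counts`, `fixed_functions`, `inequivalent_functions`, `pair_group`) are illustrative, and permutations are represented as tuples, which is an assumption of this sketch.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import prod

def cycle_counts(perm):
    """Return {cycle length: number of cycles} for a permutation tuple."""
    seen, counts = set(), {}
    for start in range(len(perm)):
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = perm[x]
            length += 1
        counts[length] = counts.get(length, 0) + 1
    return counts

def fixed_functions(g, h):
    """Number of functions fixed by the pair (g, h), as in (2.55)-(2.56)."""
    c, d = cycle_counts(g), cycle_counts(h)
    def m(i):
        # Sum of j * d(j) over the cycle lengths j of h that divide i.
        return sum(j * dj for j, dj in d.items() if i % j == 0)
    return prod(m(i) ** ci for i, ci in c.items())

def inequivalent_functions(G, H):
    """Average the fixed-function counts over all pairs (Burnside's lemma)."""
    total = sum(fixed_functions(g, h) for g in G for h in H)
    return Fraction(total, len(G) * len(H))

def pair_group(n):
    """Permutations induced by S_n on the 2-element subsets of {0,...,n-1}."""
    pairs = list(combinations(range(n), 2))
    index = {p: k for k, p in enumerate(pairs)}
    return [tuple(index[tuple(sorted((s[a], s[b])))] for a, b in pairs)
            for s in permutations(range(n))]

G4 = pair_group(4)
g4 = inequivalent_functions(G4, [(0, 1)])          # only the identity on Y
N4 = inequivalent_functions(G4, [(0, 1), (1, 0)])  # S_2 acting on Y = {0, 1}
print(g4, N4, 2 * N4 - g4)  # prints: 11 6 1
```

The same functions accept any permutation groups given as lists of tuples, so other small cases can be explored by changing the argument of `pair_group`.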


EXERCISE 


2.70 How many self-complementary graphs are there of orders 5 and 6? 


2.71 Use a computer to calculate the number of self-complementary graphs of orders 7 
through 12. 


2.10 Asymmetric graphs 


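Our formula for the number of nonisomorphic graphs of order n is

$$g(n) = \frac{1}{n!} \sum_{c} h(c)\, 2^{q(c)}, \tag{2.60}$$

where $c = (c_1, \dots, c_n)$ is the cycle type of a permutation of {1,...,n},
the number of permutations of such a cycle type is

$$h(c) = \frac{n!}{\prod_{k=1}^{n} k^{c_k}\, c_k!}, \tag{2.61}$$

and

$$q(c) = \sum_{k} \left\lfloor \frac{k}{2} \right\rfloor c_k + \sum_{k} k \binom{c_k}{2} + \sum_{r<s} \gcd(r, s)\, c_r c_s. \tag{2.62}$$

We will prove that

$$g(n) \sim \frac{2^{\binom{n}{2}}}{n!}. \tag{2.63}$$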


This result means that the typical unlabeled graph has no nontrivial 
symmetries and hence is an asymmetric graph. 
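
Formulas (2.60) through (2.62) translate directly into a short program. The sketch below is one possible implementation, with illustrative function names; it enumerates cycle types as integer partitions and also reports the ratio of g(n) to $2^{\binom{n}{2}}/n!$ as an exact fraction.

```python
from fractions import Fraction
from math import comb, factorial, gcd

def partitions(n, max_part=None):
    """Yield cycle types of n as dicts {cycle length k: multiplicity c_k}."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield {}
        return
    for k in range(min(n, max_part), 0, -1):
        for rest in partitions(n - k, k):
            cycle_type = dict(rest)
            cycle_type[k] = cycle_type.get(k, 0) + 1
            yield cycle_type

def h(c, n):
    """Number of permutations with cycle type c, formula (2.61)."""
    denom = 1
    for k, ck in c.items():
        denom *= k ** ck * factorial(ck)
    return factorial(n) // denom

def q(c):
    """Exponent of 2 for cycle type c, formula (2.62)."""
    ks = sorted(c)
    total = sum((k // 2) * c[k] + k * comb(c[k], 2) for k in ks)
    total += sum(gcd(r, s) * c[r] * c[s]
                 for i, r in enumerate(ks) for s in ks[i + 1:])
    return total

def g(n):
    """Number of nonisomorphic graphs of order n, formula (2.60)."""
    return sum(h(c, n) * 2 ** q(c) for c in partitions(n)) // factorial(n)

def ratio(n):
    """g(n) divided by 2^C(n,2)/n!, as an exact fraction."""
    return Fraction(g(n) * factorial(n), 2 ** comb(n, 2))

print([g(n) for n in range(1, 8)])  # prints: [1, 2, 4, 11, 34, 156, 1044]
```

Calling `float(ratio(n))` for increasing n gives a numerical feel for the asymptotic statement (2.63).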
The lower bound is trivial:

$$g(n) \ge \frac{2^{\binom{n}{2}}}{n!}.$$

Next we prove the upper bound. Assume that the permutation has n − j
fixed points, where 0 ≤ j ≤ n. The case j = 0 gives the term $2^{\binom{n}{2}}/n!$.
We will show that the other terms are bounded above by
expressions which, upon division by $2^{\binom{n}{2}}/n!$, tend to 0 as n → ∞.

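We have

$$\begin{aligned}
q(c) &\le \frac{1}{2} \sum_{k} k c_k + \frac{1}{2} \sum_{k} k c_k (c_k - 1) + \sum_{r<s} \min(r, s)\, c_r c_s \\
&\le \frac{1}{2} \sum_{k} k c_k^2 + \sum_{r<s} \left(\frac{r+s}{2}\right) c_r c_s \\
&= \frac{1}{2} \sum_{k} k c_k \sum_{k} c_k \\
&= \frac{1}{2}\, n \sum_{k} c_k.
\end{aligned}$$

The number of permutations with n − j fixed points is bounded above
by

$$\binom{n}{n-j}\, j! = \frac{n!}{(n-j)!} \le n^j.$$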


The case j = 2 yields the exponent

$$q(c) = 1 + \binom{n-2}{2} + 1 \cdot (n-2) \cdot 1 = \binom{n}{2} - n + 2.$$

Upon dividing by $2^{\binom{n}{2}}/n!$, the contribution is bounded by
$2^{-n+2+2\log_2 n}$, which tends to 0 as n → ∞.
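Now let’s look at the case j ≥ 3, so that 1 − j/2 < 0. We obtain

$$\sum_{k} c_k \le (n - j) + \frac{j}{2} = n - \frac{j}{2}$$

and

$$q(c) \le \frac{1}{2}\, n \left(n - \frac{j}{2}\right) = \binom{n}{2} + \frac{1}{2}\, n \left(1 - \frac{j}{2}\right).$$

Upon dividing by $2^{\binom{n}{2}}/n!$, the contribution is bounded by
$2^{\frac{1}{2} n(1-j/2) + j \log_2 n}$, which tends to 0 as n → ∞.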


Erdős and Rényi theorem (1963). Almost all unlabeled graphs are asymmetric.


EXERCISE 


2.72 Find an asymmetric graph with six vertices. Show that no graph with five vertices 
is asymmetric. 


2.73 Use a computer to compare g(n) and $2^{\binom{n}{2}}/n!$ for 2 ≤ n ≤ 20.
2.74 Estimate g(100). 


Notes 


The notation for falling factorial and rising factorial varies
considerably. The falling factorial function is often written $(x)_n$.


James Sylvester (1814-1897) introduced the notion of Ferrers 
diagrams in 1853. Apparently they were discovered by Norman 
Ferrers (1829-1903). 


G. H. Hardy and Srinivasa Ramanujan (1887-1920) were the first
mathematicians to find an explicit formula for p(n). The most
elementary asymptotic estimate is $\log_e p(n) \sim \pi (2n/3)^{1/2}$. See [12] for
details.

In 1937 George Pólya (1887-1985) published the formula for
enumerating graphs in connection with a problem concerning the
number of chemical isomers. The language of his counting theory was
quite descriptive. A function from X to Y is called a configuration.
The elements of X are places and the elements of Y are figures. The
store inventory is $\sum_j y_j$ and the pattern inventory is
$Z(G)[x_i \leftarrow \sum_j y_j^i]$. Thus the basic problem is to find the number of
G-inequivalent patterns. An alternative approach to the Pólya theory
of counting was undertaken independently by John H. Redfield
(1879-1944) in 1927.


CHAPTER 3 


THE PIGEONHOLE PRINCIPLE 


The pigeonhole principle is an important key to solving many problems in 
combinatorics. In this chapter, we discuss several versions of the pigeonhole 
principle and give many applications. 


3.1 The principle 


The most important theorem in existential combinatorics is also the simplest: the 
pigeonhole principle. It occurs in many variations, a few of which we discuss 
here, and says that not every element in a set is below average and not every 
element is above average. We now state and prove a general version.


Pigeonhole principle. If f : A → B is a function, with A and B finite
nonempty sets, then the following two statements hold:

(1) There exists b ∈ B with $|f^{-1}(b)| \ge |A|/|B|$.

(2) There exists b ∈ B with $|f^{-1}(b)| \le |A|/|B|$.


Proof. We prove (1) by contradiction. Suppose that $|f^{-1}(b)| < |A|/|B|$ for all
b ∈ B. Then

$$|A| = \sum_{b \in B} |f^{-1}(b)| < \sum_{b \in B} \frac{|A|}{|B|} = |A|.$$

We conclude that |A| < |A|, an absurdity. Therefore, our assumption that
$|f^{-1}(b)| < |A|/|B|$ for all b ∈ B is false. Hence, there exists b ∈ B with
$|f^{-1}(b)| \ge |A|/|B|$.
We prove (2) by replacing “<” by “>” in the above argument.
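
A small numerical illustration of the two statements may help. The sketch below uses an arbitrary example function (residues modulo 3 on ten integers); the helper name `preimage_sizes` is illustrative only.

```python
from collections import Counter
from fractions import Fraction

def preimage_sizes(f, domain, codomain):
    """Return the list of sizes |f^{-1}(b)| for each b in the codomain."""
    counts = Counter(f(a) for a in domain)
    return [counts.get(b, 0) for b in codomain]

domain = range(10)
codomain = range(3)
sizes = preimage_sizes(lambda a: a % 3, domain, codomain)
average = Fraction(len(domain), len(codomain))

# Statement (1): some preimage is at least average; (2): some is at most average.
print(sizes, max(sizes) >= average, min(sizes) <= average)
```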


EXAMPLE 3.1 Sets of cards 


A popular board game features cards of three suits: cannon, horse, and 
soldier. A “set” consists of three horses, three soldiers, three cannons, or one 
card of each suit. It is possible to have four cards without possessing a set, 


e.g., two horses and two soldiers. Prove that any five cards contain a set. 
Solution: Let the three suits be designated by C, H, and S. If the five cards do 
not include one card of each suit, then at least one suit is absent, say S. 
Therefore, we may define a function f: {a, b, c, d, e} → {C, H} from the five
cards to their respective suits. (The function isn’t necessarily onto.) By the 
pigeonhole principle, the preimage of one suit contains at least three cards. 
These cards constitute a set. 


Nonuniform pigeonhole principle. If f: A → B is a function from a finite
nonempty set A to an n-set $B = \{b_1, \dots, b_n\}$, then the following two
statements hold:
(1) If $|A| = (a_1 - 1) + \cdots + (a_n - 1) + 1$, then $|f^{-1}(b_i)| \ge a_i$ for some i.
(2) If $|A| = (a_1 + 1) + \cdots + (a_n + 1) - 1$, then $|f^{-1}(b_i)| \le a_i$ for some i.


Proof. (1) If the inequality holds for no i, then

$$|A| = \sum_{i=1}^{n} |f^{-1}(b_i)| \le \sum_{i=1}^{n} (a_i - 1) < |A|,$$

a contradiction.
(2) is proved similarly.
If |A| = |B| + 1, then the following special case of the pigeonhole principle
results. 


Pigeonhole principle (special case). If f: A → B is a function and |A| = |B|
+ 1, then there exists b ∈ B with $|f^{-1}(b)| \ge 2$. In other words, $f(a_1) = f(a_2)$
for some distinct $a_1, a_2 \in A$.


This version of the pigeonhole principle is often paraphrased as: “If n + 1 
objects are placed in n pigeonholes, then at least one pigeonhole must contain at 
least two objects.” Hence the term pigeonhole principle. 

We now give a pigeonhole principle proof of a very old but interesting result 
in number theory. For different proofs, see [21]. 


Approximation theorem. For any real number α and n ∈ N, there exist
integers p and q with 1 ≤ q ≤ n and $|\alpha - p/q| < 1/(qn) \le 1/q^2$.


Proof. Define $f: N_{n+1} \to \{[0, \frac{1}{n}), [\frac{1}{n}, \frac{2}{n}), \dots, [\frac{n-1}{n}, 1)\}$ by letting f(j) be the
subinterval of [0, 1) which contains $j\alpha - \lfloor j\alpha \rfloor$. (Note that $\lfloor x \rfloor$ is the greatest
integer less than or equal to x.) The pigeonhole principle implies the existence of
j, k (j > k) with f(j) = f(k). According to the definition of the function f, there
exists an integer p with $|j\alpha - k\alpha - p| < 1/n$. Letting q = j − k (so that 1 ≤ q
≤ n), the inequality becomes $|\alpha - p/q| < 1/(qn) \le 1/q^2$.

This theorem is used to ensure good rational approximations to irrational
numbers α. For example, taking α = π and n = 10, the theorem guarantees the
existence of a rational number p/q with $|\pi - p/q| < 1/(10q) \le 1/q^2$. In fact, the
well-known approximation 22/7 satisfies the inequality. Can you find an
approximation to $\alpha = \sqrt{2}$ with n = 10?
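
The proof is constructive, and its steps can be sketched in code. The function below (the name `pigeonhole_approximation` is illustrative) places the fractional parts of jα for j = 1, ..., n + 1 into n subintervals and stops at the first collision. It uses floating-point arithmetic, so it is a sketch under that assumption and not an exact procedure.

```python
import math

def pigeonhole_approximation(alpha, n):
    """Return integers (p, q) with 1 <= q <= n and |alpha - p/q| < 1/(q*n)."""
    seen = {}  # subinterval index -> the value of j that landed there
    for j in range(1, n + 2):
        frac = j * alpha - math.floor(j * alpha)
        bucket = min(int(frac * n), n - 1)  # guard against rounding up to n
        if bucket in seen:
            k = seen[bucket]
            q = j - k
            p = math.floor(j * alpha) - math.floor(k * alpha)
            return p, q
        seen[bucket] = j
    raise AssertionError("unreachable: n + 1 values fall into n subintervals")

p, q = pigeonhole_approximation(math.pi, 10)
print(p, q)  # prints: 22 7
```

For α = π and n = 10 the first collision occurs between j = 8 and k = 1, which recovers the approximation 22/7 mentioned above.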

In the preceding versions of the pigeonhole principle we have assumed that 
both the domain and the codomain are finite sets. If the domain is infinite, then 
the following highly useful infinitary pigeonhole principle results. 


Infinitary pigeonhole principle. If f: A → B is a function from an
infinite set A to a finite set B, then there exists b ∈ B with $f^{-1}(b)$ infinite.


EXERCISES 
3.1 Using the pigeonhole principle, show that some positive integral power 
of 17 ends in 0001 (base 10). 
3.2 Let q be an odd integer greater than 1. Show that there is a positive
integer n such that $2^n - 1$ is a multiple of q.
3.3 Let $A_1, \dots, A_{100}$ be subsets of a finite set S, each with $|A_i| > \frac{2}{3}|S|$. Prove
that there exists x ∈ S with x contained in at least 67 of the $A_i$. Show that 67
is the best possible result.
3.4 Prove that if S is a subset of {1,...,2n} and |S| > n, then there exist x, y ∈
S with x and y relatively prime.
3.5 Prove that if S is a subset of {1,...,2n} and |S| > n, then there exist x, y ∈
S with x a divisor of y.
3.6 (Putnam Competition, 1964) Let S be a set of n > 0 elements, and let $A_1$,
$A_2, \dots, A_k$ be a family of distinct subsets, with the property that any two of
these subsets meet. Assume that no other subset of S meets all of the $A_i$.
Prove that $k = 2^{n-1}$.
This result is generalized in many interesting ways in [3]. 
3.7 Prove the infinitary pigeonhole principle. 


Hint: Assume that $f^{-1}(b)$ is finite for all b ∈ B and obtain a contradiction.
3.8 Prove that every polyhedron has two faces with the same number of
edges.
3.9 Let A be an m × n matrix with distinct real number entries in increasing
order in each row from left to right. Rearrange the elements of each column
of A so that they are in increasing order from top to bottom; call the resulting
matrix A’. Show that the elements of each row of A’ are in increasing order
from left to right.
3.10 An n × n binary matrix contains a 1 in every row, column, and diagonal
(diagonals of every length are considered here). What is the minimum 
number of 1's in this matrix? 
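3.11 Let $L_1$ be a two-row array of positive integers

$$\begin{array}{cccc} a_1 & a_2 & \dots & a_m \\ b_1 & b_2 & \dots & b_m \end{array}$$

where the $a_i$ are distinct integers written in increasing order. Let $c_1, \dots, c_n$ ($c_1$
< $c_2$ < ··· < $c_n$) be the list of all integers that occur in $L_1$, and for 1 ≤ i ≤ n, let
$d_i$ be the number of occurrences of $c_i$ in $L_1$. Let $L_2$ be the array

$$\begin{array}{cccc} c_1 & c_2 & \dots & c_n \\ d_1 & d_2 & \dots & d_n \end{array}$$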
For example, if $L_1$ is the array

Ir2zeee Ww 

331441 3 «6 
then $L_2$ is the array

123 4 5 6 8 10 11 


Sb PE eee E Ts 
Starting with any array $L_1$, the array $L_2$ is created as indicated above. Then the
operation is repeated on $L_2$ to form a new array $L_3$, and so on. Show that the
number of distinct arrays produced in this manner is always finite. We say that
each sequence of arrays eventually “goes into a loop.” Show that a loop always
consists of one, two, or three arrays.


3.2 The lattice point problem and SET®


In this section, we apply the pigeonhole principle to problems concerning lattice 
points in Euclidean space. 

A lattice point in the plane is an ordered pair p = (x, y) with integer coordinates 
x and y. 


EXAMPLE 3.2 A lattice point midpoint 


Let $p_1, p_2, p_3, p_4, p_5$ be five lattice points in the plane. Prove that the
midpoint of the line segment $p_i p_j$ determined by some two distinct lattice
points $p_i$ and $p_j$ is also a lattice point.

Solution: Define a function

$$f: \{p_1, p_2, p_3, p_4, p_5\} \to \{(0, 0), (0, 1), (1, 0), (1, 1)\}$$

by mapping $p_i$ to the ordered pair $(x_i \bmod 2, y_i \bmod 2)$. By the pigeonhole
principle, some two points $p_i$ and $p_j$ have the same image. These points satisfy
the requirement of the problem, for the midpoint of $p_i p_j$ is $((x_i + x_j)/2, (y_i +
y_j)/2)$. Both coordinates are integers because $x_i$, $x_j$ have the same parity and $y_i$, $y_j$
have the same parity.

The “five” in the problem above is best possible in the sense that one can find 
four lattice points determining no lattice midpoint, e.g., (0, 0), (0, 1), (1, 0), (1, 
1). 
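
The parity argument of Example 3.2 can be sketched as a short search. The function name `lattice_midpoint_pair` and the sample points are illustrative, and the input points are assumed to be distinct.

```python
def lattice_midpoint_pair(points):
    """Return two points with the same coordinate parities, or None."""
    seen = {}  # parity class (x mod 2, y mod 2) -> first point found in it
    for p in points:
        key = (p[0] % 2, p[1] % 2)
        if key in seen:
            return seen[key], p
        seen[key] = p
    return None

five_points = [(0, 0), (1, 3), (2, 5), (7, 4), (4, 8)]
a, b = lattice_midpoint_pair(five_points)
midpoint = ((a[0] + b[0]) // 2, (a[1] + b[1]) // 2)
print(a, b, midpoint)  # prints: (0, 0) (4, 8) (2, 4)

# Four points, one in each parity class, admit no such pair.
print(lattice_midpoint_pair([(0, 0), (0, 1), (1, 0), (1, 1)]))  # prints: None
```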


EXAMPLE 3.3 A lattice point centroid 
The centroid of three points $p_i = (x_i, y_i)$, $p_j = (x_j, y_j)$, $p_k = (x_k, y_k)$ is $((x_i + x_j
+ x_k)/3, (y_i + y_j + y_k)/3)$. What is the minimum number n of lattice points,
some three of which must determine a lattice point centroid?
Solution: We show that n ≤ 13 by defining a function $f: \{p_1, \dots, p_{13}\} \to \{0, 1, 2\}$
which maps $p_i$ to the residue class modulo 3 of its first coordinate. By the
pigeonhole principle, some five lattice points, say $p_1, p_2, p_3, p_4, p_5$, have the
same image. By the analysis of the suits in the board game mentioned in
Example 3.1, some three of these points, say $p_1, p_2, p_3$, have second coordinate
residues 000, 111, 222, or 012. These three lattice points determine a lattice
point centroid. Therefore, n exists and satisfies n ≤ 13.
The determination of the minimum n that forces the existence of three points 
determining a lattice point centroid is a more difficult matter. It turns out that the 
minimum value is n = 9. The argument is carried out modulo 3. Thus, there are 
just nine possible ordered pairs from which to choose. The following list of 
ordered pairs (modulo 3) shows that n > 8:

(0, 0), (0, 0), (1, 0), (1, 0), (0, 1), (0, 1), (1, 1), (1, 1).

In order to prove n ≤ 9, we must show that any nine points include three whose
first coordinates and second coordinates are of the form 000, 111, 222, or 012.
The nine possible ordered pairs are conveniently represented by the nine
non-ideal points of the order 3 projective plane of Figure 6.1. The 12 lines of the
figure (excluding the line at infinity) correspond to triples of points which
determine a lattice point centroid. If any ordered pair is chosen three times, these
three ordered pairs determine a lattice point centroid. Therefore, let us assume
that each ordered pair is chosen at most twice, and hence at least five different
ordered pairs are chosen. By shuffling the rows and columns of the figure (if
necessary), we may assume that three of the points are (0, 0), (1, 0), and (0, 1). If
no three points are collinear, then we may not choose the points (2, 0), (0, 2), or
(2, 2). This means that we must choose two of the three points (1, 1), (2, 1), and
(1, 2). But any of these choices gives three collinear points: (0, 1), (1, 1), and (2,
1); (1, 0), (1, 1), and (1, 2); or (0, 0), (1, 2), and (2, 1).
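
Both halves of the claim n = 9 can also be confirmed by exhaustive search, since only residues modulo 3 matter. The sketch below is a brute-force check, not the geometric argument of the text; the names are illustrative. It tests the eight-point list and then every multiset of nine residue pairs.

```python
from itertools import combinations, combinations_with_replacement, product

RESIDUE_PAIRS = list(product(range(3), repeat=2))  # the nine pairs modulo 3

def has_centroid_triple(points):
    """True if some three of the points sum to (0, 0) modulo 3."""
    return any(
        (a[0] + b[0] + c[0]) % 3 == 0 and (a[1] + b[1] + c[1]) % 3 == 0
        for a, b, c in combinations(points, 3)
    )

eight_points = [(0, 0), (0, 0), (1, 0), (1, 0), (0, 1), (0, 1), (1, 1), (1, 1)]
print(has_centroid_triple(eight_points))  # prints: False

# Every multiset of nine residue pairs (repetition allowed) contains a triple.
all_nine_forced = all(
    has_centroid_triple(multiset)
    for multiset in combinations_with_replacement(RESIDUE_PAIRS, 9)
)
print(all_nine_forced)  # prints: True
```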

The d-dimensional generalization of the above problem calls for the minimum
n such that, given any n lattice points in $\mathbb{R}^d$ (ordered d-tuples of integers), some
k determine a lattice point centroid. Let n(k, d) be the minimum such n. The
existence of n(k, d) is guaranteed by the pigeonhole principle, and, in fact, the
pigeonhole principle yields an upper bound for n(k, d) on the order of $k^{d+1}$. The
set of d-tuples each of whose coordinates is 0 or 1, taken with multiplicity k − 1,
establishes the lower bound $n(k, d) > (k - 1)2^d$.
Open problem. Find a formula for n(k, d) for all k, d. 


This question is known as the lattice point problem. Trivially, n(1, d) = 1. In
1961 Paul Erdős, A. Ginzburg, and A. Ziv proved that n(k, 1) = 2k − 1. It has
since been shown that n(3, 3) = 19; see Problem 6298, American Mathematical
Monthly 89 (1982) 279-280. In 2003 Christian Reiher and Carlos di Fiore
proved that n(k, 2) = 4k − 3. An exercise calls for a proof that $n(2, d) = 2^d + 1$.

A situation similar to the lattice point problem comes from a card game called 
SET®, created by the population geneticist Marsha Falco. The game is played 
with cards identified by four attributes: number, shape, color, and shading. Each 
attribute occurs with three possible values (e.g., shape = oval, diamond, or 
squiggle). Hence there are 81 cards. Thinking of the attributes as coordinates and 
the values as residue classes, the cards are represented as 4-tuples over $\mathbb{Z}_3$, e.g.,
(0, 1, 0, 1), (2, 1, 0, 2), and (1, 1, 1, 1). A “set” consists of three cards that, with 
respect to each attribute, all agree or all disagree. For example, the cards 
corresponding to (0, 1, 1, 2), (0, 2, 0, 2), and (0, 0, 2, 2) constitute a set. The 
game is similar to the lattice point problem with k = 3 and d = 4, but in the game 
no 4-tuple of coordinates is repeated. One can define a generalized version of the 
game in which there are d attributes each occurring with k possible values. 


Hence, there are $k^d$ cards. A set consists of k cards that all agree or all disagree
on each attribute. Equivalently, a card is a vector with d coordinates, each of 
which can take k values. A set is a collection of k vectors, which in each 
coordinate all agree or all disagree. Then we can define the generalized SET® 
function n'(k, d) to be the minimum number of cards (or vectors) that guarantee a 
set. Little is known about the function n'(k, d). The values n'(3, 4) = 21, n’ (3, 5) 
= 46, and n'(3, 6) = 113 are known. A collection of 20 cards containing no set is 
shown in Figure 3.1 (the four-dimensional space is represented by a three-by- 
three array of three-by-three arrays). Information on SET may be found at 
http://www.setgame.com. 
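
The definition of a set in the generalized game is easy to state in code. In this sketch a card is a tuple of attribute values and `is_set` is an illustrative name; "all disagree" is read as all k values being distinct.

```python
def is_set(cards, k=3):
    """True if the k cards all agree or all disagree in every coordinate."""
    if len(cards) != k:
        return False
    # zip(*cards) groups the values coordinate by coordinate.
    return all(len(set(values)) in (1, k) for values in zip(*cards))

# The example from the text, with 4-tuples over Z_3.
print(is_set([(0, 1, 1, 2), (0, 2, 0, 2), (0, 0, 2, 2)]))  # prints: True

# Three cards that agree on two values and differ on a third are not a set.
print(is_set([(0, 1, 0, 1), (2, 1, 0, 2), (1, 1, 1, 1)]))  # prints: False
```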


Figure 3.1 Twenty cards containing no set. 


The functions n(k, d) and n'(k,d) are similar, although there are two 
differences. A minor difference is that in the card game each d-tuple occurs 
exactly once while in the lattice point problem repetition is allowed. A more 
important difference is that in the lattice point problem we require that a sum of 
lattice points be zero modulo k, while in the card game we require that the cards 
all agree or all disagree in each coordinate. For example, the lattice points (1, 0), 
(1, 1), (3, 0), and (3, 3) have a lattice point centroid, but the corresponding cards 
do not constitute a set (with k = 4). If k = 3, this difference disappears. 

We mentioned that the value n(3, 3) = 19 has been proved. The corresponding 
fact in the card game is that n'(3, 3) = 10. 

In the card game setting, there are 27 possible cards (each defined by three 
attributes that occur with three possible values). We would like to show that 
every collection of 10 cards contains a set. To translate back to the lattice point 
problem, assume that we have 19 ordered triples of numbers modulo 3. If no 
triple is repeated three times, then, by the pigeonhole principle, there must be at 
least 10 distinct triples. An exercise calls for a construction to show that n(3, 3) 
> 18 (and incidentally n’(3, 3) > 9). 

It is convenient to represent the cards both as integers between 0 and 26
(inclusive) and as 3-tuples of elements in $\mathbb{Z}_3$. We define these 3-tuples as the
base 3 digits of the numbers.

The number of 10-subsets of a 27-element set is $\binom{27}{10}$ = 8,436,285, and this is
too many subsets to test. We will discuss a way to reduce the number of
10-subsets which must be examined.


Without loss of generality, we may assume that 0 = (0, 0, 0) is an element of
each 10-subset. Also, we note that in any 10-subset of $\mathbb{Z}_3^3$ there exist three
linearly independent vectors (a plane contains only nine points). By a linear
transformation (a 3 × 3 matrix), we may map these three vectors to the specific
vectors 1 = (0, 0, 1), 3 = (0, 1, 0), and 9 = (1, 0, 0). Therefore, since linear
transformations map sets to sets, we need only consider 10-subsets that contain
these three points. Furthermore, each pair of the four points 0, 1, 3, 9 determines
a third point which forms a set with that pair. There are $\binom{4}{2}$ = 6 such points,
namely, 2 = (0, 0, 2), 6 = (0, 2, 0), 18 = (2, 0, 0), 8 = (0, 2, 2), 20 = (2, 0, 2), and
24 = (2, 2, 0). If any of these points is present in a 10-subset, then the 10-subset
contains a set and we do not need to examine it. We can therefore exclude these
six points from consideration.

To summarize, of the integers 0 through 26 (inclusive), the integers 0, 1, 3, and
9 are included in every 10-subset, and the integers 2, 6, 8, 18, 20, and 24 are
excluded in every 10-subset. Hence, we now have only $\binom{27-4-6}{10-4} = \binom{17}{6}$ =
12,376 subsets to examine.

A computer run can check these subsets and thereby show that every 10-subset 
contains a set, and therefore n'(3, 3) = 10 and n(3, 3) = 19. 
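
The computer run described above is small enough to sketch here. The code below is one possible version, not the author's program, and its names are illustrative. It encodes cards as the integers 0 through 26, computes the six excluded points, and tests the 12,376 remaining candidate subsets.

```python
from itertools import combinations

def digits(x):
    """Base 3 digits of x (0 <= x <= 26) as a triple over Z_3."""
    return (x // 9, (x // 3) % 3, x % 3)

def third_point(a, b):
    """The unique card completing a set with the distinct cards a and b."""
    c = [(-(s + t)) % 3 for s, t in zip(digits(a), digits(b))]
    return 9 * c[0] + 3 * c[1] + c[2]

def contains_set(cards):
    """True if some three of the given distinct cards form a set."""
    members = set(cards)
    return any(third_point(a, b) in members for a, b in combinations(cards, 2))

base = [0, 1, 3, 9]
excluded = {third_point(a, b) for a, b in combinations(base, 2)}
free = [x for x in range(27) if x not in base and x not in excluded]
candidates = [base + list(extra) for extra in combinations(free, 6)]

print(sorted(excluded))  # prints: [2, 6, 8, 18, 20, 24]
print(len(candidates))   # prints: 12376
print(all(contains_set(c) for c in candidates))  # prints: True
```

The function `third_point` relies on the fact that three distinct cards form a set exactly when their coordinates sum to zero modulo 3.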


Open problem. Find n'(4, 3). 


EXERCISES 


3.12 (Putnam Competition, 1971) Let there be given nine lattice points 
(points with integer coordinates) in three-dimensional Euclidean space. 
Show that there is a lattice point on the interior of one of the line segments 
joining two of these points. 


3.13 Prove that $n(2, d) = 2^d + 1$.
3.14 Prove that n(3, 3) > 18. In fact, n(3, 3) = 19.

3.15 Prove that $(k - 1)2^d + 1 \le n(k, d) \le (k - 1)k^d + 1$.

3.16 Prove that if k is a power of 2, then $n(k, d) = (k - 1)2^d + 1$.
3.17 Prove that n(3, d) = 2n'(3, d) − 1.

3.18 Prove that $n'(3, d + 1) - 1 \ge 2(n'(3, d) - 1)$.
3.19 Prove that $(k - 1)^d < n'(k, d) \le k^d$.

3.20 Prove that $n'(k, 2) = (k - 1)^2 + 1$.

3.21 Prove that given any $n^d - n^{d-1} + 1$ d-tuples from the set S = {1, 2,
..., n}, there exist n which, in each coordinate, are a permutation of S. Show
that the result is not true for $n^d - n^{d-1}$ d-tuples.


3.3 Graphs 


Graph theory began in 1736 when Leonhard Euler (1707-1783) solved the
famous problem concerning a certain system of seven bridges over the river Pregel. See
[16]. In the last 50 years there has been an explosion in graph theory research 
and applications. Today, there are many areas of graph theory research including 
algebraic graph theory, extremal graph theory, and topological graph theory. 
Within combinatorics, graph theory is closely related to design theory, Ramsey 
theory, and coding theory. In this section we give some basic definitions and an 
indication of the deeper results of graph theory which will be studied in the next 
chapter. 

A graph G is an ordered pair (V, E), consisting of a vertex set V and an edge
set $E \subseteq [V]^2$. Vertices are also called points or nodes. Edges are also called lines
or arcs. In our definition of graph there are no loops or multiple edges. In a
drawing of a graph, two vertices x and y are joined by a line if and only if {x, y}
∈ E. Two vertices joined by a line are said to be adjacent; if they are not joined
by a line, they are nonadjacent. If |V| = p and |E| = q, then we say that G has
order p and size q. The degree of a vertex v, denoted δ(v), is the number of edges
incident to v. The complement $\overline{G}$ of a graph G has the same vertex set as G, but
two vertices are adjacent in $\overline{G}$ if and only if they are nonadjacent in G.

Certain graphs occur so frequently that they require names. The complete
graph $K_n$ consists of n vertices and all possible edges. The complete
bipartite graph $K_{m,n}$ consists of a set A of m points, a set B of n points, and all
the mn edges between A and B. The infinite complete graph $K_\infty$ contains a
countably infinite set of points and all possible edges. Likewise, the infinite
complete bipartite graph $K_{\infty,\infty}$ contains countably infinite sets A and B and all
edges between A and B. The cycle $C_n$ consists of n vertices connected by n edges
in a cyclical fashion. The path $P_n$ is $C_n$ minus an edge. Figure 3.2 illustrates
some of these graphs. For general references on graph theory, see [13], [5], and
[28].


Figure 3.2 A complete graph, a complete bipartite graph, a cycle, and a path. 

One of the most elementary propositions of graph theory is called the
“Handshake Theorem.” If some people in a group shake hands, then there will 
be two people who shake the same number of hands. 


Handshake Theorem. In any graph G with a finite number p ≥ 2 of vertices,
some two vertices have the same degree.


Proof. Suppose that G has p vertices. Then each vertex has degree equal to one 
of the numbers 0,...,p — 1. However, it is impossible for G to have both a vertex 
of degree 0 and a vertex of degree p — 1. Therefore the list of degrees of the p 
vertices contains at most p — 1 different numbers. By the pigeonhole principle, 
some two vertices have the same degree. 

If δ(v) is the same for all vertices, then we say that G is regular of degree δ(v).
Note that complete graphs and cycles are regular.

If G is any finite graph, the independence number α(G) is the maximum
possible number of pairwise nonadjacent vertices of G. The chromatic number
χ(G) is the minimum number of colors in a coloring of the vertices of G with the
property that no two adjacent vertices share the same color.

Here is another simple theorem proved by the pigeonhole principle. 


Theorem. In any graph G with p vertices, p ≤ α(G)χ(G).


Proof. Consider the vertices of G as partitioned into χ(G) color classes. By the
pigeonhole principle, one of the classes must contain at least p/χ(G) vertices, and
these vertices are pairwise nonadjacent. Thus α(G) ≥ p/χ(G) and the result
follows immediately.

Equality in the above theorem holds, for example, when G consists of the 


vertices and edges of a cube. 
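
The cube example can be checked by brute force. The sketch below builds the cube graph on the eight binary triples and computes both parameters by exhaustive search; the function names are illustrative, and the method is only practical for very small graphs.

```python
from itertools import combinations, product

vertices = list(product((0, 1), repeat=3))
# Two corners of the cube are adjacent when they differ in one coordinate.
edges = {frozenset((u, v)) for u, v in combinations(vertices, 2)
         if sum(a != b for a, b in zip(u, v)) == 1}

def independence_number(vertices, edges):
    """Size of a largest set of pairwise nonadjacent vertices."""
    return max(
        len(subset)
        for r in range(len(vertices) + 1)
        for subset in combinations(vertices, r)
        if all(frozenset(pair) not in edges for pair in combinations(subset, 2))
    )

def chromatic_number(vertices, edges):
    """Least k for which a proper vertex coloring with k colors exists."""
    for k in range(1, len(vertices) + 1):
        for coloring in product(range(k), repeat=len(vertices)):
            color = dict(zip(vertices, coloring))
            if all(color[u] != color[v] for u, v in map(tuple, edges)):
                return k

alpha = independence_number(vertices, edges)
chi = chromatic_number(vertices, edges)
print(len(vertices), alpha, chi)  # prints: 8 4 2
```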

The famous “four color theorem,” proved in 1976 by Kenneth Appel and
Wolfgang Haken, is the statement that χ(G) ≤ 4 for any planar graph G. (A graph
is planar if it can be drawn in the plane without edge crossings.) Combining this
result with the theorem on the independence number and chromatic number of a
graph, we arrive at the relation α(G) ≥ p/4 for any planar graph G. We can turn a
planar graph into a planar map by placing a territory at each vertex and allowing
two territories to share a common boundary when the two vertices in the graph
are adjacent. In terms of maps, Appel and Haken’s result is that every map can
be colored with four colors so that no two bordering territories have the same
color. It follows from the theorem on the independence number and chromatic
number of a graph that any planar map on p vertices contains at least p/4
territories no two of which share a border.

A path in a graph G from vertex $v_0$ to vertex $v_n$ is a sequence of distinct edges

$$\{v_0, v_1\}, \{v_1, v_2\}, \dots, \{v_{n-1}, v_n\}.$$

The path is simple if the vertices $v_0, v_1, \dots, v_n$ are distinct. A circuit is a path from
v to v for some vertex v. A simple circuit is a cycle. We say that G is connected
if there is a path between every two vertices. Note that each of the graphs in 
Figure 3.2 is connected. 

The next result, a special case of Turán’s theorem called Mantel’s theorem,
foreshadows Ramsey’s theorem of the next chapter.

We consider an extremal property of graphs. How many edges are possible in
a triangle-free graph G on 2n vertices? Certainly, G can have $n^2$ edges without
containing a triangle: let G be the complete bipartite graph $K_{n,n}$, consisting of
two sets of n vertices each and all the edges between the two sets. Indeed, $n^2$
turns out to be the maximum possible number of edges. That is, if G has $n^2 + 1$
edges, then G contains a triangle. This we prove by mathematical induction
using the pigeonhole principle.


Mantel’s theorem (1907). If a graph G of order 2n has $n^2 + 1$ edges, then
G contains a triangle.


Proof. If n = 1, then G cannot have $n^2 + 1$ edges; hence the statement is
vacuously true. Assuming the result for n, we now consider a graph G on 2(n +
1) vertices with $(n + 1)^2 + 1$ edges. Let x and y be adjacent vertices in G, and let
H be the restriction of G to the other 2n vertices. If H has more than $n^2$ edges,
then we are finished by the induction hypothesis. Suppose that H has at most $n^2$
edges, and therefore at least 2n + 1 edges join x and y to vertices in H. By the
pigeonhole principle, there exists a vertex z in H that is adjacent to both x and y.
Hence G contains the triangle xyz.
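
For very small orders, Mantel's theorem can be confirmed by exhaustive search. The following sketch (illustrative names, brute force only) finds the largest number of edges in a triangle-free graph on a given number of labeled vertices.

```python
from itertools import combinations

def has_triangle(num_vertices, edge_set):
    """True if the graph with the given edge set contains a triangle."""
    return any({(a, b), (a, c), (b, c)} <= edge_set
               for a, b, c in combinations(range(num_vertices), 3))

def max_triangle_free_edges(num_vertices):
    """Largest size of a triangle-free graph on num_vertices vertices."""
    pairs = list(combinations(range(num_vertices), 2))
    for size in range(len(pairs), -1, -1):
        if any(not has_triangle(num_vertices, set(chosen))
               for chosen in combinations(pairs, size)):
            return size

# Orders 2n = 4 and 2n = 6 should give n^2 = 4 and n^2 = 9.
print(max_triangle_free_edges(4), max_triangle_free_edges(6))  # prints: 4 9
```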


Theorem. Up to isomorphism, $K_{n,n}$ is the only triangle-free graph with
2n vertices and $n^2$ edges.


Proof. The argument uses induction on n and the previous theorem. For n = 1,
the result is trivially true. Assume the result holds for n. Let G be a graph on 2(n
+ 1) vertices with $(n + 1)^2$ edges and no triangle. Let u and v be adjacent
vertices in G. Let H be the graph restricted to the other 2n vertices. By the
previous theorem, H has at most $n^2$ edges. However, if H has fewer than $n^2$ edges,
then there are more than 2n edges between the set {u, v} and H; by the
pigeonhole principle, there exists a triangle. Hence, H has exactly $n^2$ edges, H
is isomorphic to $K_{n,n}$, and there are exactly 2n edges from {u, v} to H. The
reader can now show that u and v are each joined by n edges to H, with u joined
to one independent set in H and v joined to the other. Therefore, G is isomorphic
to $K_{n+1,n+1}$. This completes the induction and the proof.


EXERCISES 


3.22 Show that the theorem on the independence number and chromatic 
number of a graph does not hold for infinite graphs. 

3.23 Find two nonisomorphic graphs with p = 12, α(G) = 3, and χ(G) = 4.
This shows that the upper bound of the theorem on the independence
number and chromatic number of a graph may be met by nonisomorphic
graphs.

3.24 Use the infinitary pigeonhole principle to prove that if G is a countably
infinite graph, then at least one of α(G) and χ(G) must be infinite.

3.25 Prove that if G is a graph with δ(v) ≥ p/2 for every vertex v, then G is
connected.

3.26 (G. A. Dirac, 1952) Show that under the hypothesis of the previous
exercise, G contains a subgraph isomorphic to $C_p$. Such a subgraph is called
a Hamiltonian circuit, after the mathematician William Rowan Hamilton
(1805-1865).

3.27 A tree is a connected graph with no cycles. Prove that in a tree with p
vertices and q edges, p = q + 1.

3.28 A graph is called “cubic” if every vertex is of degree 3. Prove that the 
edges of any cubic Hamiltonian graph (one with a Hamiltonian circuit) can 
be colored with three colors so that no two edges of the same color have a 
common vertex. 


3.4 Colorings of the plane 


The concept of this section, partitions of the plane, foreshadows Van der 
Waerden’s theorem of Chapter 4. 

Suppose that the plane is partitioned into two (disjoint) subsets G and R (green
and red). We will show that one of the two subsets contains the vertices of a
Euclidean rectangle with sides parallel to the coordinate axes. In fact,
partitioning the whole plane is unnecessary. The same result follows if we
partition just the 21 lattice points of $N_7 \times N_3$ into two subsets, so let us assume
only that. We say that each lattice point is “colored” either G or R.


Theorem. If the 21 lattice points of $N_7 \times N_3$ are colored G and R, there
exist four points, all the same color, lying on the vertices of a rectangle
with sides parallel to the coordinate axes.


Proof. Each column of three points in this lattice contains three points of color 
G, two G's and one R, two R's and one G, or three R's. For the moment, the 
relevant fact is that there is a majority of G's or a majority of R's in each column. 
Let us refer to a column as a G-majority or an R-majority column. By the 
pigeonhole principle, some four columns are G-majority or some four columns 
are R-majority. Without loss of generality, suppose that there are four G- 
majority columns. We will show that there are four points all colored G which 
are the vertices of a rectangle. If any of the four G-majority columns contains
three points colored G, then we handicap ourselves by changing the color of an
arbitrary point in that column to R. (Our result will follow even with this
handicap.) Now we have four columns which each contain exactly two points
colored G and one point colored R. There are three possible patterns for the
configuration of the points: GGR, GRG, and RGG. By the pigeonhole principle,
there are two columns with the same color pattern. The four G's in these
columns are the vertices of a rectangle with sides parallel to the coordinate axes.

It is easy to see that neither the lattice N_7 × N_2 nor the lattice N_6 × N_3 is
sufficient, when 2-colored, to force the existence of four monochromatic points
on the vertices of a rectangle with sides parallel to the coordinate axes. Thus,
the lattice N_7 × N_3 is minimal with respect to this property.
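
The claim for lattices with three rows can also be checked by machine. The following is a minimal sketch, not part of the original text: each column of three points is encoded as a 3-bit mask (one bit per row, one value per color), and a backtracking search looks for a coloring of a given width with no monochromatic axis-parallel rectangle. The names `conflict` and `rectangle_free_exists` are hypothetical helpers introduced only for this illustration.

```python
from itertools import combinations

ROWS = 3  # number of lattice points in each column


def conflict(c1, c2):
    """Return True if columns c1 and c2 (bitmasks) form a monochromatic rectangle."""
    for r1, r2 in combinations(range(ROWS), 2):
        colors = {(c1 >> r1) & 1, (c1 >> r2) & 1, (c2 >> r1) & 1, (c2 >> r2) & 1}
        if len(colors) == 1:  # all four corner points share one color
            return True
    return False


def rectangle_free_exists(width, chosen=()):
    """Backtracking search for a 2-coloring of `width` columns with no such rectangle."""
    if len(chosen) == width:
        return True
    for col in range(2 ** ROWS):
        if all(not conflict(col, prev) for prev in chosen):
            if rectangle_free_exists(width, chosen + (col,)):
                return True
    return False


if __name__ == "__main__":
    for width in (6, 7):
        print(width, rectangle_free_exists(width))
```

Under this encoding the search should succeed for six columns and fail for seven, in agreement with the theorem and the minimality remark above.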


EXERCISES 

3.29 Exhibit 2-colorings that show that neither the lattice N_7 × N_2 nor the
lattice N_6 × N_3 is sufficient to force the existence of four monochromatic
points on the vertices of a rectangle with sides parallel to the coordinate
axes.

3.30 Investigate the same question as in the previous exercise, with three 
colors instead of two. 

3.31 (R. Bacher and S. Eliahou, 2009) Prove that no matter how N_14 × N_15
is 2-colored, there exist positive integers i, j, k such that the set

{(i, j), (i + k, j), (i, j + k), (i + k, j + k)}

is monochromatic. That is, there exists a square with horizontal and vertical
sides and monochromatic vertices.

Show that there is a 2-coloring of N_14 × N_14 that avoids a
monochromatic square.


3.5 Sequences and partial orders


Every sequence of 10 distinct integers contains an increasing subsequence of
four integers or a decreasing subsequence of four integers (or both). For
example, the sequence 5, 8, -1, 0, 2, -4, -2, 1, 7, 6 contains the increasing
subsequence -1, 0, 2, 7.

This proposition is an existence result. No matter which 10 integers are 
chosen, and no matter in what order they occur, there exists a specific type of 
subsequence, namely, a monotonic subsequence of four integers. 

In this section, we apply the pigeonhole principle to two types of mathematical
structures: sequences and partial orders. Our goal is to show that arbitrary
sequences and partial orders contain highly nonrandom substructures. This is
indicative of a basic principle of existential combinatorics: complete disorder is
impossible.

A sequence (finite or infinite) is increasing (or strictly increasing) if a_i < a_j for
all i < j; decreasing (or strictly decreasing) if a_i > a_j for all i < j; monotonically
increasing if a_i ≤ a_j for all i < j; monotonically decreasing if a_i ≥ a_j for all i < j;
and monotonic if it is either monotonically increasing or monotonically
decreasing.


EXAMPLE 3.4 


The sequence {1, 1, 0, 0, -1, -1,...} is monotonically decreasing. The
sequence {1, 4, 9, 16, 25, 36,...} is strictly increasing.
We say that {b_1,...,b_m} is a subsequence of {a_1,...,a_n} if there exists a strictly
increasing function f: {1,...,m} → {1,...,n} for which b_i = a_{f(i)} for all i.


EXAMPLE 3.5 
The sequence {1, 2, 3, 2} is a subsequence of the sequence {1, 4, 2, 3, 5, 2}. 


Erdős–Szekeres theorem (1935). Let m, n ∈ N. Every sequence of mn +
1 real numbers contains a monotonically increasing subsequence of m + 1
terms or a monotonically decreasing subsequence of n + 1 terms (or both).


Proof. Suppose that S = {a_1,...,a_{mn+1}} is a sequence of real numbers. For 1 ≤ k
≤ mn + 1, let i_k be the length of a longest monotonically increasing subsequence
starting with a_k, and let d_k be the length of a longest monotonically decreasing
subsequence starting with a_k. Then the ordered pairs (i_k, d_k) are distinct. For if j
< k and a_j ≤ a_k, then i_j > i_k, while if j < k and a_j ≥ a_k, then d_j > d_k. But by the
pigeonhole principle, if 1 ≤ i_k ≤ m and 1 ≤ d_k ≤ n for all k, then some ordered
pairs (i_k, d_k) are not distinct. Therefore, i_k ≥ m + 1 or d_k ≥ n + 1 for some k.
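
The labels used in the proof are easy to compute. The sketch below is an illustration only, with a hypothetical function name `monotone_labels`. It assigns to each position k the pair made of the longest monotonically increasing and longest monotonically decreasing subsequence lengths starting at that term, using a quadratic dynamic program run from right to left. The sequence is the 10-term example given at the start of this section.

```python
def monotone_labels(seq):
    """Return the list of pairs (i_k, d_k) from the pigeonhole argument above."""
    n = len(seq)
    inc = [1] * n  # inc[k]: longest monotonically increasing subsequence starting at k
    dec = [1] * n  # dec[k]: longest monotonically decreasing subsequence starting at k
    for k in range(n - 1, -1, -1):
        for j in range(k + 1, n):
            if seq[k] <= seq[j]:
                inc[k] = max(inc[k], 1 + inc[j])
            if seq[k] >= seq[j]:
                dec[k] = max(dec[k], 1 + dec[j])
    return list(zip(inc, dec))


if __name__ == "__main__":
    sequence = [5, 8, -1, 0, 2, -4, -2, 1, 7, 6]
    labels = monotone_labels(sequence)
    print(labels)
    # The proof says the pairs are pairwise distinct.
    print(len(set(labels)) == len(labels))
```

Since the ten pairs are distinct, they cannot all fit in a 3 × 3 grid of values, which is exactly the pigeonhole step of the proof with m = n = 3.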


EXAMPLE 3.6 


Taking m = 3 and n = 3, the Erdős–Szekeres theorem guarantees that a
sequence S of mn + 1 = 10 real numbers contains a monotonic subsequence
of four terms. If the 10 terms of S are distinct, then of course the
subsequence will be strictly increasing or strictly decreasing. Thus, the
sequence 5, 8, -1, 0, 2, -4, -2, 1, 7, 6 contains the strictly increasing
subsequence -1, 0, 2, 7.
The expression mn + 1 in the Erdős–Szekeres theorem is best possible in the
sense that there exists a sequence of mn real numbers which contains neither a
monotonically increasing subsequence of m + 1 terms nor a monotonically
decreasing subsequence of n + 1 terms. We form such a sequence by
concatenating n sequences of m increasing terms in the following manner. For
each j ∈ N_n, let S_j = {a_{1j},...,a_{mj}} be an increasing sequence of m real numbers,
and suppose that every term of S_j is greater than every term of S_k whenever j < k.
Then the sequence

S = {a_{11},...,a_{m1}, a_{12},...,a_{m2},..., a_{1n},...,a_{mn}}

contains no increasing subsequence of length m + 1 and no decreasing
subsequence of length n + 1. In general, there are many sequences which avoid
monotonic subsequences of these lengths. The question of the number of such
sequences is answered by the theory of Young tableaux. See Notes.
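
As a concrete illustration of this construction, the sketch below builds one such sequence from the integers 1,...,mn and confirms its longest monotone subsequences by a simple dynamic program. The function names `extremal_sequence` and `longest_monotone` are hypothetical, and the integer values are only one convenient choice among many.

```python
def extremal_sequence(m, n):
    """Concatenate n increasing blocks of m terms; earlier blocks hold larger values."""
    return [(n - 1 - j) * m + i for j in range(n) for i in range(1, m + 1)]


def longest_monotone(seq, increasing=True):
    """Length of a longest monotonically increasing (or decreasing) subsequence."""
    best = [1] * len(seq)
    for k in range(len(seq)):
        for j in range(k):
            ok = seq[j] <= seq[k] if increasing else seq[j] >= seq[k]
            if ok:
                best[k] = max(best[k], best[j] + 1)
    return max(best, default=0)


if __name__ == "__main__":
    s = extremal_sequence(3, 4)
    print(s)
    print(longest_monotone(s, increasing=True), longest_monotone(s, increasing=False))
```

An increasing subsequence must stay inside one block, and a decreasing subsequence can take at most one term from each block, which is why the lengths come out as m and n.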


Erdős–Szekeres theorem (infinitary version). Every infinite sequence
of real numbers contains an infinite monotonic subsequence.


Proof. Let S = {a_1, a_2,...} be an arbitrary infinite sequence of real numbers. We
will inductively define an infinite monotonic subsequence of S. Relabel the
elements of S as {a_{11}, a_{12},...}. By the infinitary pigeonhole principle, there
exists an infinite subsequence S_2 = {a_{22}, a_{23},...} of S - {a_{11}} all of whose
elements are greater than or equal to a_{11} or all of whose elements are less than
or equal to a_{11}. Continuing in this manner, we find an infinite subsequence S_3 =
{a_{33}, a_{34}, a_{35},...} of S_2 - {a_{22}} all of whose elements are greater than or equal
to a_{22} or all of whose elements are less than or equal to a_{22}. This process
continues, defining a new subsequence S_i at each step. The subsequence T =
{a_{11}, a_{22}, a_{33},...} has the property that each element a_{ii} is greater than or equal
to all elements following it, or less than or equal to all elements following it.
Again, by the infinitary pigeonhole principle, there exists a subsequence U of T
each of whose elements is greater than or equal to those following it or each of
whose elements is less than or equal to those following it. Thus, U is an infinite
monotonic subsequence of S.


The set of real numbers R is linearly ordered. For any two distinct real
numbers x and y, either x < y or y < x. The forthcoming Dilworth’s lemma
generalizes the Erdős–Szekeres theorem to partially ordered sets.

Recall that a partial order on X is a relation on X that is reflexive,
antisymmetric, and transitive. We often denote a partial order by ⪯ and write a
⪯ b when (a, b) ∈ ⪯. Two elements a, b ∈ X are comparable if a ⪯ b or b ⪯ a
and incomparable otherwise. For example, the relation a ⪯ b if and only if a
divides b is a partial order on N.

A partial order ⪯ on X is a total order (or linear order) if every two elements
of X are comparable. For example, the usual ≤ relation is a linear order on N. If
⪯ is a partial order on X, and Y is a subset of X in which every two elements are
comparable, then Y is a chain. If no two distinct elements of Y are comparable,
then Y is an antichain. The length of a partially ordered set X is the greatest
number of elements in a chain of X, and the width is the greatest number of
elements in an antichain.


EXAMPLE 3.7 
Figure 3.3 is the directed graph representation of the partial order

⪯ = {(3, 1), (4, 1), (4, 3), (3, 2), (4, 2), (4, 5), (6, 5), (7, 5), (7, 6)}

on the set X = {1, 2, 3, 4, 5, 6, 7}. The arrows required for reflexivity and
transitivity are suppressed in the diagram. The length and width of ⪯ are
both 3. For instance, {1, 3, 4} is a chain of length 3 and {1, 2, 6} is an
antichain of size 3.


Figure 3.3 A partial order of width 3 and length 3. (Diagram not reproduced; its vertices appear in three rows: 7, 4; then 6, 3; then 5, 1, 2.)
The length and width of a partial order are related to the size of the underlying 
set by the following result of R. P. Dilworth. 


Dilworth’s lemma. In any partial order on a set of mn + 1 elements, there 
exists a chain of length m + 1 or an antichain of size n + 1. 


Proof. Let X be an arbitrary partially ordered set with mn + 1 elements. Suppose
that X contains no chain of size m + 1. Then we may define a function f: X →
{1,...,m} with f(x) equal to the greatest number of elements in a chain with
greatest element x. By the pigeonhole principle, some n + 1 elements of X have
the same image under f. By the definition of f, these elements are incomparable
(if x and y are distinct with x ⪯ y, then any chain with greatest element x can be
extended by y, so f(y) > f(x)); they form an antichain of size n + 1.


EXAMPLE 3.8 


We consider again the partial order of Figure 3.3. The size of X is |X| = 7 = 2
· 3 + 1. Therefore Dilworth’s lemma guarantees a chain of 2 + 1 elements or
an antichain of 3 + 1 elements, and we have remarked that there is a chain of
length 3.
Notice the similarity between Dilworth’s lemma and the Erdős–Szekeres
theorem. In each case, the hypothesis concerns a set of mn + 1 elements and the
conclusion concerns subsets of sizes m + 1 and n + 1. These similarities are no
coincidence. In fact, the Erdős–Szekeres theorem may be proved as a corollary
of Dilworth’s lemma. Let S = {a_1,...,a_{mn+1}} be a sequence of mn + 1 real
numbers. Define a partial order ⪯ on S by setting a_i ⪯ a_j if a_i ≤ a_j and i ≤ j.
Dilworth’s lemma guarantees a chain of m + 1 elements (corresponding to a
monotonically increasing subsequence of m + 1 terms) or an antichain of n + 1
elements (corresponding to a monotonically decreasing subsequence of n + 1
terms).

Just as the Erdős–Szekeres theorem is a best possible result, so also is
Dilworth’s lemma. In the exercises, the reader is asked to furnish an example of 
a partial order on mn elements with length m and width n. 

Dilworth’s lemma is sometimes easier to apply in the following form. 


Dilworth’s lemma (alternate form). If a partial order on n elements has
length l and width w, then n ≤ lw.


Proof. Suppose, to the contrary, that n ≥ lw + 1. Then, by Dilworth’s lemma,
there is a chain of length l + 1 or an antichain of size w + 1; these results
contradict the definition of l or w.

As you probably suspect, there is an infinitary version of Dilworth’s lemma.
The proof is an exercise.


Dilworth’s lemma (infinitary version). A partial order on N has an 
infinite chain or an infinite antichain. 


The title Dilworth’s lemma suggests that there might be a Dilworth’s theorem, 
which is the case. 


Dilworth’s theorem. Let X be a partially ordered set with length l and
width w. Then X can be partitioned into l antichains or w chains.


Proof. We only prove that X can be partitioned into l antichains. For a proof that
X can be partitioned into w chains, see [4] or [5].

Define f: X → {1,...,l} by letting f(x) be the maximum number of elements in
a chain with greatest element x. The preimage of each y ∈ {1,...,l} is an
antichain.


EXAMPLE 3.9 


Considering Figure 3.3 again, we find that X may be partitioned into three 
antichains {1,2,5}, {3,6}, {4,7}, and three chains {1,3,4}, {2}, {5,6,7}. 
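
The antichain partition can be computed directly from the function f used in the proof of Dilworth's theorem. The sketch below is illustrative only. It assumes that each listed pair (a, b) of the relation in Example 3.7 is read as "b lies below a", an interpretation chosen here because it reproduces the three antichains just stated. The names `PAIRS`, `below`, `height`, and `antichain_partition` are hypothetical.

```python
from functools import lru_cache

# Pairs of the partial order of Figure 3.3, read (assumption) as "second element is below first".
PAIRS = [(3, 1), (4, 1), (4, 3), (3, 2), (4, 2), (4, 5), (6, 5), (7, 5), (7, 6)]
X = range(1, 8)


def below(x):
    """Elements lying strictly below x according to PAIRS."""
    return [b for a, b in PAIRS if a == x]


@lru_cache(maxsize=None)
def height(x):
    """f(x): the greatest number of elements in a chain with greatest element x."""
    return 1 + max((height(y) for y in below(x)), default=0)


def antichain_partition():
    """Group the elements of X by their value of f; each group is an antichain."""
    levels = {}
    for x in X:
        levels.setdefault(height(x), []).append(x)
    return [levels[h] for h in sorted(levels)]


if __name__ == "__main__":
    print(antichain_partition())
```

The number of groups equals the length of the partial order, since f takes every value from 1 up to the size of a longest chain.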


EXERCISES 


3.32 Let n^2 + 1 distinct points be given in R^2. Prove that there is a sequence
of n + 1 points (x_1, y_1),...,(x_{n+1}, y_{n+1}) for which x_1 ≤ x_2 ≤ ··· ≤ x_{n+1} and
y_1 ≥ y_2 ≥ ··· ≥ y_{n+1}, or a sequence of n + 1 points for which x_1 ≤ x_2 ≤ ··· ≤
x_{n+1} and y_1 ≤ y_2 ≤ ··· ≤ y_{n+1}.
3.33 Give an example of a partial order on mn elements with length m and
width n.
3.34 (Putnam Competition, 1967) Let 0 < a_1 < a_2 < ··· < a_{mn+1} be mn + 1
integers. Prove that you can select either m + 1 of them no one of which
divides any other, or n + 1 of them each dividing the following one.

Hint: Apply Dilworth’s lemma. 


3.35 For any n^2 + 1 closed intervals of R, prove that n + 1 of the intervals
share a point or n + 1 of the intervals are disjoint.

Hint: Let α ⪯ β if the closed interval α is entirely to the left of the closed
interval β. Apply Dilworth’s lemma.


3.36 Prove the infinitary Dilworth’s lemma. 

3.37 Prove the infinitary Erdős–Szekeres theorem as a corollary of the
infinitary Dilworth’s lemma.

3.38 Let ≤_1 and ≤_2 be two total orders on a set of size n^2 + 1. Show that
there is a subset of size n + 1 on which ≤_1 and ≤_2 totally agree or totally
disagree.


3.39 Prove that if 2n - 1 total orders are given on m^(2^(2n-2)) + 1 points, then
some m + 1 points are totally ordered by n agreeing orders.


3.6 Subsets 


Let X(t) be the collection of subsets of the t-element set N_t, and let ⊆ be the
containment partial order on X(t). For instance, if t = 3, then X(t) consists of
eight elements: ∅, {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, and {1,2,3}. Some
examples of containment are {1, 2} ⊆ {1, 2, 3}, ∅ ⊆ {3}, and {1, 3} ⊆ {1, 3}.

The size of X(t) is 2^t. What are the length and width of ⊆? The length is t + 1,
because the longest chains start with ∅ and include one new element at each step
until all t elements are included. The width of ⊆ is the subject of Sperner’s
theorem. Remember that an antichain of X(t) is a collection of subsets of N_t none
of which contains another.

Emanuel Sperner (1905-1980) proved the following result in 1928. We give a 
simpler proof essentially due to D. Lubell. See [9]. For a proof of Sperner’s 
theorem using the probabilistic method, see [2]. 


Sperner’s theorem. An antichain of subsets of N_t (under the usual ⊆
order) has at most \binom{t}{\lfloor t/2 \rfloor} elements. Furthermore, the
\binom{t}{\lfloor t/2 \rfloor} subsets of size ⌊t/2⌋ form an antichain.


Note that Sperner’s theorem tells us that the width of the partially ordered set
X(t) is \binom{t}{\lfloor t/2 \rfloor}.

Proof. Let A = {A_1,...,A_m} be an antichain of subsets of N_t with |A_i| = a_i for 1
≤ i ≤ m. For each i, the set A_i is contained in exactly a_i!(t - a_i)! chains of length t
+ 1. (Such chains commence with the empty set, add one element at a time until
A_i is exhausted, then add one element at a time until the complement of A_i is
exhausted.) Because these chains are distinct and there are t! chains of length t +
1 altogether,

\[ \sum_{i=1}^{m} a_i!\,(t - a_i)! \le t!. \]

Dividing this inequality by t! we obtain

\[ \sum_{i=1}^{m} \binom{t}{a_i}^{-1} \le 1. \]

Since \binom{t}{k} is maximized when k = ⌊t/2⌋, it follows that

\[ m \binom{t}{\lfloor t/2 \rfloor}^{-1} \le \sum_{i=1}^{m} \binom{t}{a_i}^{-1} \le 1. \]

Therefore

\[ m \le \binom{t}{\lfloor t/2 \rfloor}. \]
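
For small t the bound can be confirmed by exhaustive search. The sketch below is an illustration rather than part of the proof: subsets of N_t are encoded as bitmasks, and a backtracking search (hypothetical function `max_antichain`) finds the size of a largest antichain, which is then compared with the middle binomial coefficient.

```python
from math import comb


def max_antichain(t):
    """Size of a largest antichain of subsets of a t-element set, by backtracking."""
    subsets = list(range(1 << t))  # each integer is the bitmask of one subset

    def comparable(a, b):
        # a is contained in b, or b is contained in a
        return a & b == a or a & b == b

    best = 0

    def extend(start, chosen):
        nonlocal best
        best = max(best, len(chosen))
        for s in range(start, len(subsets)):
            if all(not comparable(subsets[s], c) for c in chosen):
                chosen.append(subsets[s])
                extend(s + 1, chosen)
                chosen.pop()

    extend(0, [])
    return best


if __name__ == "__main__":
    for t in range(1, 6):
        print(t, max_antichain(t), comb(t, t // 2))
```

The search visits every antichain once, so it is practical only for very small t; it is meant as a sanity check on the statement, not as an algorithm for computing widths in general.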


What are the antichains of X(t) with \binom{t}{\lfloor t/2 \rfloor} elements? Equality in the above
relation can be attained only if \binom{t}{\lfloor t/2 \rfloor} = \binom{t}{a_i} for each a_i. If t is even, this
forces a_i = t/2. If t is odd, then a_i can equal (t - 1)/2 or (t + 1)/2. We now prove
that if t is odd, then all elements of the antichain are size (t - 1)/2 or all are size
(t + 1)/2. The proof is essentially due to László Lovász.


Theorem. Let A be an antichain of X(t) containing \binom{t}{\lfloor t/2 \rfloor} elements. If t is
even, then A is the collection of all subsets of N_t of size t/2. If t is odd,
then A is the collection of all subsets of size (t - 1)/2 or the collection of
all subsets of size (t + 1)/2.


Proof. We have already demonstrated the t even case. Suppose that t = 2u + 1.
As each maximal chain in X(t) contains exactly one element of A, if U is a subset
of size u, V is a subset of size u + 1, and U ⊆ V, then A contains exactly one of U
and V. Suppose that U is a subset of size u contained in A, and U' is any other
subset of size u. Then there is a sequence of subsets

U = U_1, V_1, U_2, V_2, ..., V_{n-1}, U_n = U'

beginning with U and ending with U', whose sizes alternate between u and u +
1, and such that V_i contains U_i and U_{i+1} for each i. It follows that U' is an
element of A. Because U' was arbitrary, A contains all subsets of size u. A
similar argument shows that if A contains at least one subset of size u + 1, then A
contains every subset of size u + 1.

We are now ready to look at relations which are not transitive. In Chapter 4 we 
begin by discussing graphs, where the relations are merely reflexive and
symmetric. The theorems are more difficult to prove in this more general setting,
and the analysis of best possible results is much more difficult.


EXERCISES 


3.40 Let a_1,...,a_n, b ∈ R, with all a_i ≥ 1. Show that the maximum number of
sums ±a_1 ± a_2 ± ··· ± a_n in the open interval (b, b + 2) is \binom{n}{\lfloor n/2 \rfloor}.

3.41 Let a_1,...,a_n be positive real numbers. Show that the maximum number
of equal sums ε_1 a_1 + ··· + ε_n a_n (ε_i = 0 or 1) is \binom{n}{\lfloor n/2 \rfloor}.

See [3] and [26] for a discussion of the Littlewood–Offord problem
concerning the number of sums Σ_i ε_i z_i (ε_i = ±1 and |z_i| ≥ 1) lying inside
any given circle in the complex plane.


Notes 


Johann Peter Gustav Lejeune Dirichlet (1805-1859) was the first mathematician 
to explicitly use the pigeonhole principle in proofs. He referred to it as the 
“drawer principle.” 

The word “graph” was first used in mathematics in an 1877 paper by James
Sylvester (1814-1897). In 1936 Dénes Kőnig (1884–1944) wrote the first book
on graph theory, Theorie der endlichen und unendlichen Graphen.

The special case of Turán’s theorem (1941) was proved by W. Mantel in 1907.

The Erdős–Szekeres theorem was proved in 1935 and may be regarded as a
sort of proto-Ramsey theorem (even though Ramsey’s theorem was proved in
1930).

According to the Erdős–Szekeres theorem, every permutation of the numbers
1,...,10 contains a monotonic subsequence of length four. But this result does
not hold for permutations of the numbers 1,...,9; there are many permutations of
1,...,9 that do not have a monotonic subsequence of length four. The question of
exactly how many is answered by the theory of Young tableaux. For a discussion
of Young tableaux, see [19] and [26]. We give a few details here.

Let n = λ_1 + ··· + λ_k. A Young tableau of shape λ_1 + ··· + λ_k is a Ferrers
diagram of shape (λ_1,...,λ_k) in which each dot has been replaced by a different
integer from the set {1,...,n}. The number n is the order of the tableau. The
positions in a tableau are called cells. A standard tableau is one in which the
integers increase in every column and in every row.

Figure 3.4 shows an example of a standard Young tableau with n = 9 = 4 + 3 + 2.


Figure 3.4 A standard Young tableau of shape 4 + 3 + 2.


1 3 4 8
2 5 9
6 7


How many standard Young tableaux have this shape? The answer is given by 
the hook-length formula. We define the hook-length of a cell to be one more than 
the number of cells to its right and below it. Figure 3.5 shows the hook-lengths 
of the cells of the tableau of Figure 3.4. 


Figure 3.5 The hook-lengths of the cells of a tableau. 


6 5 3 1
4 3 1
2 1


The hook-length formula says that the number of standard Young tableaux of a
given shape is equal to n! divided by the product of the hook-lengths. Thus, the
number of standard Young tableaux of shape 4 + 3 + 2 is

\[ \frac{9!}{6 \cdot 5 \cdot 4 \cdot 3 \cdot 3 \cdot 2 \cdot 1 \cdot 1 \cdot 1} = 168. \]

There are 30 partitions of the number 9. For each such partition λ, let f_λ be the
number of standard tableaux of shape λ. If you compute these numbers (via the
hook-length formula), you may be surprised to find that

\[ \sum_{\lambda} f_\lambda^2 = 9!. \]


This identity (for any positive integer n) is known as Schur’s formula. It is used 
in the theory of representations of the symmetric group. 


Schur’s formula. For each partition λ of n, let f_λ be the number of
standard Young tableaux of shape λ. Then

\[ \sum_{\lambda} f_\lambda^2 = n!. \]
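
The hook-length computations and Schur's formula for n = 9 can be reproduced with a short program. This is a sketch under the definitions given in these Notes; the function names `partitions`, `hook_lengths`, and `num_standard_tableaux` are hypothetical. A shape is represented as a tuple of row lengths in nonincreasing order.

```python
from math import factorial


def partitions(n, largest=None):
    """Yield the partitions of n as tuples of nonincreasing positive parts."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def hook_lengths(shape):
    """Hook-length of each cell: cells to the right, cells below, plus the cell itself."""
    col_heights = [sum(1 for row in shape if row > j) for j in range(shape[0])]
    return [[(shape[i] - j - 1) + (col_heights[j] - i - 1) + 1
             for j in range(shape[i])]
            for i in range(len(shape))]


def num_standard_tableaux(shape):
    """Hook-length formula: n! divided by the product of all hook-lengths."""
    product = 1
    for row in hook_lengths(shape):
        for h in row:
            product *= h
    return factorial(sum(shape)) // product


if __name__ == "__main__":
    print(hook_lengths((4, 3, 2)))
    print(num_standard_tableaux((4, 3, 2)))
    shapes = list(partitions(9))
    print(len(shapes), sum(num_standard_tableaux(s) ** 2 for s in shapes) == factorial(9))
```

The same functions can be called with other values of n to test Schur's formula beyond the case worked out in the text.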


Schur’s formula indicates that there is a correspondence between pairs of
standard Young tableaux of identical shape and permutations of n. This
correspondence is effected by the Robinson–Schensted algorithm.

We give an example of the algorithm. Let n = 9 and

σ = 583276491.

We will construct an ordered pair (P, Q) of standard Young tableaux (of the
same shape) corresponding to σ.

The first task is to construct P. (We will construct Q later.) We read the
permutation σ from left to right and construct P step by step. The 5 is placed in
the top left position of the tableau, and the 8, being greater than 5, is placed
below:

5
8

Now we come to the 3. Being less than 5, the 3 “bumps” the 5 to the right and
takes its place:

3 5
8

Likewise, 2 is less than 3, so it bumps the 3 to the right (and the 5 along with it)
and takes its place:

2 3 5
8

Now comes the 7. Because 7 is less than 8, it is inserted into the second row,
bumping the 8 to the right. Then the 6 bumps the 7 (and the 8 along with it):

2 3 5
6 7 8

Next, the 4 bumps the entire second row to the right, and the 8 is bumped up to
the first row:

2 3 5 8
4 6 7

Finally, the 9, being greater than 4, is placed in the third row, and the 1 bumps
the first row to the right:

1 2 3 5 8
4 6 7
9

This completes the tableau P. The tableau Q consists of the numbers 1,...,9 in a
tableau of the same shape as P and in the order in which new positions were
occupied in P. Figure 3.6 shows P and Q.


Figure 3.6 The tableaux P and Q corresponding to σ = 583276491.

P:
1 2 3 5 8
4 6 7
9

Q:
1 3 4 7 9
2 5 6
8

In the Robinson–Schensted correspondence, the number of columns of the
tableaux P and Q is equal to the length of a longest decreasing subsequence of
the permutation, and the number of rows is equal to the length of a longest
increasing subsequence. In our example, the tableaux have five columns and
three rows, and indeed, a longest decreasing subsequence of σ has five terms and
a longest increasing subsequence has three terms.
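
The worked example can be replayed in code. The sketch below is one possible reading of the bumping procedure described above, not a definitive implementation: each new entry is inserted into the first column, replacing the topmost entry larger than it, and the displaced entry is inserted into the next column to the right. The tableaux are stored column by column and converted to rows at the end. The names `robinson_schensted` and `to_rows` are hypothetical.

```python
from bisect import bisect_right


def to_rows(cols):
    """Convert a tableau stored as a list of columns into a list of rows."""
    height = len(cols[0]) if cols else 0
    return [[col[r] for col in cols if len(col) > r] for r in range(height)]


def robinson_schensted(perm):
    """Return (P, Q) as lists of rows, inserting column by column as in the example."""
    p_cols, q_cols = [], []
    for step, x in enumerate(perm, start=1):
        c = 0
        while True:
            if c == len(p_cols):          # start a new column on the right
                p_cols.append([x])
                q_cols.append([step])
                break
            col = p_cols[c]
            pos = bisect_right(col, x)    # index of the topmost entry larger than x
            if pos == len(col):           # x is largest: it goes at the bottom
                col.append(x)
                q_cols[c].append(step)
                break
            col[pos], x = x, col[pos]     # x takes the place; the old entry is bumped
            c += 1
    return to_rows(p_cols), to_rows(q_cols)


if __name__ == "__main__":
    P, Q = robinson_schensted([5, 8, 3, 2, 7, 6, 4, 9, 1])
    for row in P:
        print(*row)
    print()
    for row in Q:
        print(*row)
```

For the permutation 583276491 this produces tableaux with five columns and three rows, matching the longest decreasing and increasing subsequence lengths noted above.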

To calculate the number of permutations of {1,...,9} with no monotonic
subsequence of length four, we use the hook-length formula to find the number
of standard Young tableaux of shape 3 + 3 + 3:

\[ \frac{9!}{5 \cdot 4 \cdot 4 \cdot 3 \cdot 3 \cdot 3 \cdot 2 \cdot 2 \cdot 1} = 42. \]

Since the desired permutations correspond to ordered pairs of standard Young
tableaux of this shape, the number of such permutations is 42^2 = 1764.
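
A count of this kind can be cross-checked by brute force on a smaller case. The sketch below uses hypothetical helper names and enumerates all permutations of {1,...,n}, keeping those with no monotone subsequence of k terms. For n = 4 and k = 3 the only admissible shape is 2 + 2, which has two standard tableaux, so the count should be 2^2 = 4. Calling `count_avoiding(9, 4)` performs the same check for the value 1764 computed above, but it examines all 9! permutations and takes noticeably longer.

```python
from itertools import permutations


def longest_monotone(seq, increasing=True):
    """Length of a longest increasing (or decreasing) subsequence of distinct terms."""
    best = [1] * len(seq)
    for k in range(len(seq)):
        for j in range(k):
            ok = seq[j] < seq[k] if increasing else seq[j] > seq[k]
            if ok:
                best[k] = max(best[k], best[j] + 1)
    return max(best, default=0)


def count_avoiding(n, k):
    """Count permutations of 1..n with no monotone subsequence of k terms."""
    return sum(
        1
        for p in permutations(range(1, n + 1))
        if longest_monotone(p, True) < k and longest_monotone(p, False) < k
    )


if __name__ == "__main__":
    print(count_avoiding(4, 3))
    print(count_avoiding(5, 3))
```

The second call illustrates the Erdős–Szekeres theorem with m = n = 2: no permutation of five numbers avoids a monotone subsequence of three terms.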
We leave five problems for you to ponder:
1. How does the reverse direction of the Robinson–Schensted algorithm
work?
2. In the Robinson–Schensted correspondence, show that the number of
columns of the tableaux P and Q is equal to the length of the longest
decreasing subsequence of σ. Show that the number of rows is equal to the
length of the longest increasing subsequence of σ.
3. If the permutation σ corresponds to the pair of tableaux (P, Q), show that
σ^{-1} corresponds to the pair (Q, P).
4. A permutation σ ∈ S_n is called an involution if σ = σ^{-1}. Show that the
number of involutions in S_n equals the number of standard tableaux of order
n. Recall that these were counted in Exercise 2.69.
5. Show that the number of standard Young tableaux of shape 2 × n is given
by the Catalan number C_n.


CHAPTER 4 


RAMSEY THEORY 


The Erdős–Szekeres theorem and Dilworth’s lemma guarantee the existence of
particular substructures of certain combinatorial configurations. Large 
disordered structures contain ordered substructures. We continue this theme by 
presenting two cornerstones of Ramsey theory: Ramsey’s theorem on graphs and 
van der Waerden’s theorem on arithmetic progressions. In the process we 
discuss related results, including Schur’s theorem on equations. We also 
investigate bounds and asymptotics of Ramsey numbers using techniques from 
number theory and probability. The central pursuit is always to find ordered 
substructures of large disordered structures. We want to find order in 
randomness. 


4.1 Ramsey’s theorem 


The following problem appeared in the 1953 William Lowell Putnam
Mathematical Competition:

Six points are in general position in space (no three in a line, no four in a
plane). The fifteen line segments joining them in pairs are drawn and then
painted, some segments red, some blue. Prove that some triangle has all its
sides the same color.

The description of the six points in general position and the segments joining
them in pairs is just another way of defining the graph K_6. We introduce a few
more graph theory terms. A coloring of the set of edges of a graph G is a
function f: E(G) → S, where S is a set of colors. A coloring partitions E(G) into
color classes. If f is constant, then G is monochromatic.

Now we may rephrase the Putnam question as follows: If each edge of K_6 is
colored either red or blue, then there is a monochromatic subgraph K_3 (a
triangle). We note that the coloring may be done in an arbitrary manner. In fact,
because K_6 has \binom{6}{2} = 15 edges, there are 2^15 possible red–blue colorings of the
edges of K_6. The claim is that every one of these 32,768 colorings yields a
monochromatic K_3. (We assume that the vertices of K_6 are labeled, so we can
distinguish between differently labeled isomorphic graphs, and that all 15 edges
can be the same color, a possibility disallowed in the Putnam problem as stated.)

Here is a simple solution to the problem using the pigeonhole principle.
Choose any vertex v of K_6. By the pigeonhole principle, some three of the five
edges emanating from v are the same color. Without loss of generality, suppose v
is joined by red edges to vertices x, y, z. If any of the edges xy, yz, or xz is red,
then there is a red triangle (vxy, vyz, or vxz). However, if each of these edges is
blue, then xyz is a blue triangle.
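
Because there are only 32,768 colorings, the statement can also be verified exhaustively. The following sketch uses hypothetical function names and encodes the two colors as 0 and 1. It checks every 2-coloring of the edges of the complete graph on n labeled vertices for a monochromatic triangle.

```python
from itertools import combinations, product


def has_mono_triangle(n, color):
    """True if some triangle has all three edges of the same color."""
    return any(
        color[(a, b)] == color[(a, c)] == color[(b, c)]
        for a, b, c in combinations(range(n), 3)
    )


def every_coloring_has_mono_triangle(n):
    """Check all 2-colorings of the edges of the complete graph on n vertices."""
    edges = list(combinations(range(n), 2))
    for bits in product((0, 1), repeat=len(edges)):
        if not has_mono_triangle(n, dict(zip(edges, bits))):
            return False  # found a coloring with no monochromatic triangle
    return True


if __name__ == "__main__":
    print(every_coloring_has_mono_triangle(6))
    print(every_coloring_has_mono_triangle(5))
```

The check on five vertices fails because some coloring avoids monochromatic triangles, for example the one in which the edges of a 5-cycle receive one color and the remaining five edges the other.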
A special notation has been introduced to state results of this type. We write
(4.1) K_6 → (K_3)_2
to indicate that every 2-coloring of the edges of K_6 yields a monochromatic K_3.
This relation may also be written
(4.2) K_6 →_2 K_3.

Similarly, we write
(4.3) K_5 ↛ (K_3)_2
to say that there is a 2-coloring of K_5 with no monochromatic K_3. It is
equivalent to say that there is a graph G on five vertices such that neither G nor
its complement contains a K_3. Such a graph is exhibited in Figure 4.1.


Figure 4.1 A graph G such that neither G nor its complement contains K_3.


In general, we write
(4.4) K_n → (K_m)_2
to indicate that every 2-coloring of the edges of K_n yields a monochromatic K_m.
In 1930 F. Ramsey established the existence of such a K_n for each m. Unlike the
authors of the Putnam problem, we prefer green–red colorings to red–blue
colorings.


Ramsey’s theorem (1930). Given a, b ≥ 2, there exists a least integer
R(a, b) with the property that every green–red coloring of the edges of the
complete graph on R(a, b) vertices yields a green K_a or a red K_b.
Furthermore,
(4.5) R(a, b) ≤ R(a - 1, b) + R(a, b - 1),
for all a, b ≥ 3.


Proof. We employ induction on a and b. The basis of the induction consists of
the statements R(a, 2) = a and R(2, b) = b. These are trivial. In the first assertion,
if we 2-color K_a and any edge is red, then we obtain a red K_2, while if no edge is
red, then we obtain a green K_a. Thus R(a, 2) ≤ a. Equality follows from the fact
that an all-green-colored K_{a-1} contains neither a green K_a nor a red K_2. The
second assertion is proved similarly. Now, assuming the existence of R(a - 1, b)
and R(a, b - 1), we will show that R(a, b) exists. Let G be the complete graph on
R(a - 1, b) + R(a, b - 1) vertices, and let v be an arbitrary vertex of G. By the
pigeonhole principle, at least R(a - 1, b) green edges or at least R(a, b - 1) red
edges emanate from v. Without loss of generality, suppose that v is joined by
green edges to a complete subgraph on R(a - 1, b) vertices. By definition of
R(a - 1, b), this subgraph must contain a red K_b or a green K_{a-1}. In the latter case,
the green K_{a-1} and v, and all the edges between the two, constitute a green K_a.
We have shown that G contains a green K_a or a red K_b. Therefore, R(a, b) exists
and satisfies

R(a, b) ≤ R(a - 1, b) + R(a, b - 1).
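
Iterating this inequality from the base cases R(a, 2) = a and R(2, b) = b gives explicit upper bounds. The sketch below is an illustration with a hypothetical function name `ramsey_upper_bound`. It evaluates the recurrence with equality in place of the inequality, so its output is only an upper bound on R(a, b) and need not equal the Ramsey number itself.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def ramsey_upper_bound(a, b):
    """Upper bound on R(a, b) obtained by iterating inequality (4.5)."""
    if a == 2:
        return b  # base case R(2, b) = b
    if b == 2:
        return a  # base case R(a, 2) = a
    return ramsey_upper_bound(a - 1, b) + ramsey_upper_bound(a, b - 1)


if __name__ == "__main__":
    for a in range(2, 6):
        print([ramsey_upper_bound(a, b) for b in range(2, 6)])
```

The table produced this way is a portion of Pascal's triangle: the bound for (a, b) is the binomial coefficient with upper index a + b - 2 and lower index a - 1, since both quantities satisfy the same recurrence and the same boundary values.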

The values of R(a, b) are called Ramsey numbers. Very few nontrivial Ramsey 
numbers (with a or b greater than 2) have been determined. The fact that we 
have proved that the Ramsey numbers exist but we do not know their values 
illustrates one disadvantage of existential proofs. 

By definition, R(m, m) is the least positive integer n for which K_n → (K_m)_2.
The values R(m, m) are called diagonal Ramsey numbers because they appear on
the main diagonal of a table of Ramsey numbers. We know one diagonal
Ramsey number already: R(3, 3) = 6.


We note that R(a, b) = R(b, a) for all a, b ≥ 2, as the roles of the two variables
a and b are symmetric. Furthermore, we have noted that R(a, 2) = a for all a. We
have already proved that R(3, 3) = 6, but a second proof is furnished by the two
observations just made and the inequality R(a, b) ≤ R(a - 1, b) + R(a, b - 1).
Thus, R(3, 3) ≤ R(2, 3) + R(3, 2) = 3 + 3 = 6. The lower bound R(3, 3) > 5 is
verified by construction as before.


EXAMPLE 4.1 Confusion graph 


The confusion graph is defined as follows. Suppose that the vertices of the
5-cycle C_5 are a, b, c, d, e (in cyclic order) and these vertices represent
symbols transmitted over a noisy channel. Adjacent symbols are said to be
confusable; each is easily mistaken for the other. Nonadjacent symbols are
not confusable. Thus, c and d are confusable while c and e are not. The
independence number α(G) is the maximum number of nonconfusable
symbols in V(G). It is easy to see that α(C_5) = 2, as illustrated by the pair a,
d. For two finite graphs G and H, we define a new graph called the strong
product G ⊠ H as the set V(G) × V(H), with (g, h) adjacent to (g', h') if and
only if g is adjacent to or equal to g' and h is adjacent to or equal to h'. We
think of α(G ⊠ H) as the maximum number of nonconfusable ordered pairs
in the set V(G) × V(H), where nonconfusability means nonconfusability in
at least one coordinate. (We are assuming that a symbol is confusable with
itself.)

If A is a nonconfusable subset of V(G) and B is a nonconfusable subset of
V(H), then A × B is a nonconfusable subset of V(G) × V(H). Therefore
α(G)α(H) ≤ α(G ⊠ H). Ramsey’s theorem furnishes a strict upper bound:
α(G ⊠ H) < R(α(G) + 1, α(H) + 1). For suppose the upper bound is attained by a
subset A × B of V(G) × V(H). Color an edge green if there is
nonconfusability in the first coordinate and red if there is nonconfusability in
the second coordinate. (If nonconfusability holds in both coordinates, then
we color the edge green.) Ramsey’s theorem guarantees that A has at least
α(G) + 1 points or that B has at least α(H) + 1 points, both contradictions.


Putting our lower and upper bounds together we obtain 4 = α(C_5)^2 ≤
α(C_5 ⊠ C_5) < R(α(C_5) + 1, α(C_5) + 1) = R(3, 3) = 6. Is the value of
α(C_5 ⊠ C_5) 4 or 5?


EXERCISES 


4.1 For the confusion graph C_5, show that α(C_5 ⊠ C_5) = 5.


4.2 A tournament is a complete directed graph. Use Ramsey’s theorem to 
show that for every n there exists an f(n) such that every tournament on f(n) 
vertices contains a transitive subtournament on n vertices. 


4.3 Prove that every 2-coloring of the edges of K_6 yields two
monochromatic triangles.


Hint: Assign to each pair of edges incident at a vertex a score of +2 if they 
are the same color and —1 if not. 


4.2 Generalizations of Ramsey’s theorem


What we can do with two colors, we can do with an arbitrary number, as the 
following generalization of Ramsey’s theorem shows. All the theorems of this 
section were proved by Frank Ramsey in the original 1930 paper. See Notes. 


Ramsey’s theorem for multiple colors. For any c ≥ 2 and $a_1,\dots,a_c \ge 2$,
there exists a least integer $R(a_1,\dots,a_c)$ with the following property: If the
edges of the complete graph on $R(a_1,\dots,a_c)$ vertices are partitioned into
color classes $A_1,\dots,A_c$, then for some i there exists a complete graph on
$a_i$ vertices all of whose edges are color $A_i$.


Proof. The case c = 2 is covered by our previous version of Ramsey’s theorem.
Suppose $R(a_1,\dots,a_{c-1})$ exists for all $a_1,\dots,a_{c-1} \ge 2$. We claim
$R(a_1,\dots,a_c)$ exists and satisfies
$R(a_1,\dots,a_c) \le R(R(a_1,\dots,a_{c-1}), a_c)$. A c-coloring of the
complete graph on $R(R(a_1,\dots,a_{c-1}), a_c)$ vertices may be regarded as a 2-coloring
with colors $\{A_1,\dots,A_{c-1}\}$ and $A_c$. Such a coloring contains a complete graph on
$a_c$ vertices colored $A_c$ or a (c − 1)-colored complete graph on $R(a_1,\dots,a_{c-1})$
vertices, in which case the induction hypothesis holds. In either case, we obtain a
complete subgraph on the required number of vertices. ∎


EXAMPLE 4.2 


We use the c-color Ramsey theorem to prove a weak version of Dilworth’s
lemma. Recall that Dilworth’s lemma states that every partial order on mn +
1 elements contains a chain of length m + 1 or an antichain of size n + 1. For
k sufficiently large, we define a coloring of the complete graph on the vertex
set $X = \{x_1,\dots,x_k\}$ as follows: Assuming i < j, color edge $x_i x_j$ blue if $x_i$ and
$x_j$ are incomparable; green if $x_i \le x_j$; and red if $x_i \ge x_j$. (Some edges may be
colored in two ways, but it won’t matter.) Now if $k \ge R(n + 1, m + 1, m + 1)$,
then we are guaranteed a blue $K_{n+1}$ (corresponding to an antichain of size n
+ 1), a green $K_{m+1}$ (corresponding to a chain of $x_i$ with increasing
subscripts), or a red $K_{m+1}$ (corresponding to a chain of $x_i$ with decreasing
subscripts). Thus we have a weak version of Dilworth’s lemma. It is true that
the best possible value mn + 1 has been replaced by the presumably much
larger value R(n + 1, m + 1, m + 1). However, we have gained information
about the increasing or decreasing nature of the subscripts of the $x_i$. It would
be unreasonable to expect that the best possible value mn + 1 would be
obtained by this proof, because Dilworth’s lemma assumes a transitive
relation while Ramsey’s theorem does not. Thus Ramsey’s theorem is more
general than Dilworth’s lemma.
We write

\[ K_n \rightarrow (K_a)_c \tag{4.6} \]

to indicate that every c-coloring of $K_n$ yields a monochromatic $K_a$. The c-color
Ramsey numbers $R(a_1,\dots,a_c)$ satisfy certain trivial relations, e.g., they are
symmetric in the c variables. Furthermore, $R(a_1,\dots,a_{c-1}, 2) = R(a_1,\dots,a_{c-1})$ for
all $a_i$, because either there is an edge colored $A_c$, or else all edges are colored
from the set $\{A_1,\dots,A_{c-1}\}$.

A hypergraph H of order n is a collection of nonempty subsets of an n-set S of 
vertices. The elements of H are called edges, corresponding to the graphical case 
in which edges are two-element subsets. A hypergraph is t-uniform if all edges 
have cardinality t. The complete t-uniform hypergraph of order n is the
collection $[S]^t$ of all t-subsets of S. One may visualize and draw hypergraphs
with edges represented by ovals (cardinality > 2), lines (cardinality = 2), and 
circles (cardinality = 1), as in Figure 4.2. 


Figure 4.2 A hypergraph with edges {1,2,3}, {1,4}, and {5}. 


It is now possible to state the most general version of Ramsey’s theorem in its
natural hypergraph setting. We sometimes write $[N_n]^t$ as $[n]^t$.


Ramsey’s theorem for hypergraphs. Let c ≥ 2 and $a_1,\dots,a_c \ge t \ge 2$.
There exists a least integer $R(a_1,\dots,a_c; t)$ with the following property:
Every c-coloring of the complete t-uniform hypergraph $[R(a_1,\dots,a_c; t)]^t$
with colors $A_1,\dots,A_c$ yields a complete t-uniform hypergraph on $a_i$
vertices in color $A_i$, for some i.


Proof. The generalization from 2-colorings to c-colorings works as it did in the
generalization from the two-color to the c-color Ramsey theorem. We leave the
argument as an exercise. We know that $R(a_1, a_2; 2)$ exists for all $a_1, a_2 \ge 2$. Let
us assume that $R(a_1, a_2; t - 1)$ exists for all $a_1, a_2 \ge 2$ and that $R(a_1 - 1, a_2; t)$
and $R(a_1, a_2 - 1; t)$ exist. We claim that $R(a_1, a_2; t)$ exists and satisfies
$R(a_1, a_2; t) \le n$, where

\[ n = 1 + R(R(a_1 - 1, a_2; t),\, R(a_1, a_2 - 1; t);\, t - 1). \]

Suppose $[N_n]^t$ has been green-red colored, and let v be one of its vertices. We
generate an induced 2-coloring of $[N_n - \{v\}]^{t-1}$ by assigning to each (t − 1)-set
A of $N_n - \{v\}$ the color which has been assigned to the t-set $A \cup \{v\}$. By
definition of n, we know that $[N_n - \{v\}]^{t-1}$ contains a green $[R(a_1 - 1, a_2; t)]^{t-1}$
or a red $[R(a_1, a_2 - 1; t)]^{t-1}$. Without loss of generality, suppose there is a green
$[R(a_1 - 1, a_2; t)]^{t-1}$ on vertex set A. By definition, $[A]^t$ contains a red $[a_2]^t$ or a
green $[a_1 - 1]^t$. In the latter case, $[A \cup \{v\}]^t$ contains a green $[a_1]^t$. We have
shown that $[N_n]^t$ contains a green $[a_1]^t$ or a red $[a_2]^t$, as required.

An application of the theorem to convex sets occurs in the exercises. Now we
discuss infinitary versions of the Ramsey theorems. We write

\[ K_\infty \rightarrow (K_\infty)_c \tag{4.7} \]

to indicate that every c-coloring of the complete countably infinite graph yields a
monochromatic complete countably infinite subgraph. Similarly, in the
hypergraph setting, the infinite t-uniform complete graph $K_\infty^{(t)}$ consists of a
countably infinite set and all possible t-element subsets. We write

\[ K_\infty^{(t)} \rightarrow (K_\infty^{(t)})_c \tag{4.8} \]

to indicate that every c-coloring of the t-uniform complete infinite graph yields a
monochromatic t-uniform complete infinite subgraph.


Ramsey’s theorem for infinite graphs. For every c ≥ 2, we have

\[ K_\infty \rightarrow (K_\infty)_c. \]


Proof. Define $f : \mathbf{N} \to \{1,\dots,c\}$ as follows. Let n = 1 and $X_n = V(K_\infty)$. Choose
$x_n \in X_n$ and let $A_i = \{v \in X_n : \text{edge } v x_n \text{ is color } i\}$. By the infinitary pigeonhole
principle, some $A_i$ is infinite. Let $X_{n+1} = A_i$, and define f(n) = i accordingly.
Replace n by n + 1 and repeat this process.

This recursive procedure defines the function f. Some $f^{-1}(j)$ is infinite and the
complete graph on vertex set $\{x_n : n \in f^{-1}(j)\}$ is monochromatic.


Ramsey’s theorem for infinite hypergraphs. For every t ≥ 2, c ≥ 2, we
have

\[ K_\infty^{(t)} \rightarrow (K_\infty^{(t)})_c. \]


The infinite hypergraph Ramsey theorem includes the infinite graph Ramsey 
theorem as a special case (when t = 2). The proof of the infinite hypergraph 
Ramsey theorem is left as an exercise. 

We close this section by indicating how Ramsey’s theorem for infinite graphs
implies Ramsey’s theorem for finite graphs. The technique for doing this, the
“compactness principle,” is used throughout combinatorics. Assuming the truth
of the infinite graph Ramsey theorem, we prove the finite graph Ramsey
theorem by contradiction. Suppose that there exists a k for which R(k, k) does not
exist. For each i ≥ k, let $f_i$ be a 2-coloring of $K_i$ without a monochromatic $K_k$.
We assume the $K_i$ are nested: $K_k \subset K_{k+1} \subset K_{k+2} \subset \cdots$. By the infinitary
pigeonhole principle, there exists an infinite subset of functions $\{f_{i_k}\} \subset \{f_i\}$
which agree on $K_k$. Similarly, there is an infinite subset of functions
$\{f_{i_{k+1}}\} \subset \{f_{i_k}\}$ agreeing on $K_{k+1}$, etc. This process yields an infinite 2-coloring of $K_\infty$
without a monochromatic $K_k$, contradicting the infinite graph Ramsey theorem.
Therefore, R(k, k) exists for each k, which implies that $R(a_1, a_2)$ exists, since it
must satisfy the inequality $R(a_1, a_2) \le R(\max\{a_1, a_2\}, \max\{a_1, a_2\})$. In the
same manner, the finite hypergraph Ramsey theorem for an arbitrary number of
colors is proven from the infinite hypergraph Ramsey theorem for an arbitrary
number of colors.


EXERCISES 


4.4 Prove the following result of Erdős and Szekeres (1935): For every m,
there exists a least integer n(m) such that any set of n(m) points in the plane
contains m points which determine a convex m-gon.

Hint: n(m) satisfies n(m) ≤ R(5, m; 4). Actually, Erdős and Szekeres proved
that $n(m) \ge 2^{m-2} + 1$ and conjectured that $n(m) = 2^{m-2} + 1$. The
determination of n(m) remains an open problem.

4.5 Prove that among infinitely many points in the plane there are infinitely 
many collinear points or infinitely many points no three of which are 
collinear. Prove also the three-dimensional analog of this problem: Among
infinitely many points in $\mathbf{R}^3$ there is an infinite planar subset or an infinite
subset containing no four coplanar points.
4.6 A c-coloring of the edges of a graph is surjective if all c colors are used.
For a ≥ b ≥ 1, let P(a, b) be the following proposition: Every surjective
a-coloring of the countably infinite complete graph yields a surjectively
b-colored infinite complete subgraph.
(a) Show that P(a, b) is true if b = 1, b = 2, or a = b. It is conjectured that
these are the only a and b for which P(a, b) is true.
(b) Show that P(10, 8) and P(46, 15) are false.
4.7 Prove that if $K_{\infty,\infty}$ is 2-colored, there exists a monochromatic $K_{\infty,\infty}$.
Interpret this result as a proposition about 2-colorings of the infinite square
lattice.
4.8 (Putnam Competition, 1988) (a) If every point of the plane is painted one 
of three colors, do there necessarily exist two points of the same color 
exactly one inch apart? 
(b) What if “three” is replaced by “nine”? 
Justify your answers. 
The answer to (a) is yes and the answer to (b) is no. The minimum 
number of colors necessary to force the conclusion in part (a) is not 
known. See [18]. 
4.9 (Putnam Competition, 1994) Show that if the points of an isosceles right
triangle of side length 1 are each colored with one of four colors, then there
must be two points of the same color which are at distance at least $2 - \sqrt{2}$
apart.
4.10 Find an infinite graph G with the following three properties:
(1) G contains no $K_4$.
(2) The addition of any edge to G completes a $K_4$.
(3) There is a 2-coloring of the edges of G with no monochromatic $K_3$.

In 1988 Joel Spencer used the probabilistic method to prove the existence
of a finite graph G with properties (1) and (2) without property (3) and
with fewer than $3 \cdot 10^9$ vertices. His result answers a question of Erdős,
who asked whether there exists such a graph with at most $10^{10}$ vertices.
See J. Spencer, Three hundred million points suffice, Journal of
Combinatorial Theory (A) 49 (1988), 210–217, and Erratum, Journal of
Combinatorial Theory (A) 50 (1989), 323.

For m, n ≥ 3 and p ≥ max{m, n}, the Folkman number F(m, n; p) is
defined as the minimum number of vertices in a graph G with the
properties (1) the largest complete graph contained in G has p vertices
and (2) any green-red coloring of the edges of G yields a green $K_m$ or a
red $K_n$. In 1967 Jon Folkman proved the existence of these numbers.
Spencer’s construction proves $F(3, 3; 3) < 3 \cdot 10^9$. Other than the Ramsey
numbers (i.e., when p = R(m, n)), the only known Folkman numbers are
F(3, 3; 5) = 8 and F(3, 3; 4) = 15. See [10, p. 1373].


4.3 Ramsey numbers, bounds, and 
asymptotics 


Until now our results have been purely existential. We have shown that
sufficiently large structures contain desired nonrandom substructures. But how 
large is sufficiently large? In general, this quantification question is extremely 
difficult, and unsolved problems abound. We present a few calculations and 
proofs in this section and summarize the scant amount of information known 
about Ramsey numbers. 

We have already shown that R(3, 3) = 6. Let us try to evaluate the next more
complicated Ramsey number, R(3, 4). To obtain an upper bound, we use the
inequality R(a, b) ≤ R(a − 1, b) + R(a, b − 1). Thus R(3, 4) ≤ R(3, 3) + R(2, 4) =
6 + 4 = 10. However, R(3, 4) turns out to be 9, not 10. For suppose there is a
green-red coloring of $K_9$ which has no green $K_3$ and no red $K_4$. Because R(2, 4)
= 4 and R(3, 3) = 6, each vertex must be incident with exactly three green edges
and five red edges. (Four green edges at a vertex would force a green $K_3$ or a red
$K_4$ among the green neighbors, six red edges would force one among the red
neighbors, and every vertex has degree 8.) But this means that the sum of the
degrees of the vertices of the green subgraph is 9 · 3 = 27, contradicting the fact
that the sum of degrees is always even (the Handshake Theorem). Hence
R(3, 4) ≤ 9. In the exercises, the reader is asked to furnish a 2-coloring of $K_8$
containing no green $K_3$ and no red $K_4$, thereby proving R(3, 4) = 9.

The Ramsey number R(3, 5) is evaluated easily: R(3, 5) ≤ R(3, 4) + R(2, 5) = 9
+ 5 = 14. In the exercises, the reader is asked to find a 2-coloring of $K_{13}$ that
shows R(3, 5) > 13, thus establishing R(3, 5) = 14.

Next we determine R(4, 4). We have the upper bound R(4, 4) ≤ R(4, 3) + R(3,
4) = 9 + 9 = 18, and 18 turns out to be the value of R(4, 4). To prove this, we
need a green-red coloring of $K_{17}$ containing no monochromatic $K_4$. In general,
colorings which establish lower bounds tend to look locally random. However,
they must contain quite a bit of structure so that they can be manipulated and
analyzed. Such pseudorandom constructions are employed throughout
combinatorics.

Let us assume that the vertices of $K_{17}$ are labeled with the residue classes
modulo 17: 0, 1, 2,...,16. An edge ij is colored green or red according to the
quadratic character of i − j modulo 17. The 16 nonzero residues fall into two
classes, quadratic residues and quadratic nonresidues. The set of quadratic
residues modulo 17 is

\[ R_{17} = \{x^2 : x \in \mathbf{Z}_{17}^*\} = \{1, 4, 9, 16, 8, 2, 15, 13\}, \]

the set of quadratic nonresidues is

\[ N_{17} = \{3, 5, 6, 7, 10, 11, 12, 14\}. \]

Note that $R_{17}$ is the range of the homomorphism $f : \mathbf{Z}_{17}^* \to \mathbf{Z}_{17}^*$, $x \mapsto x^2$. Both
$R_{17}$ and $N_{17}$ are closed under multiplication by −1 (because $-1 = 16 \in R_{17}$), so
i − j has the same quadratic character as j − i. Edge ij is colored green if $i - j \in R_{17}$
and red if $i - j \in N_{17}$. Suppose that there is a monochromatic $K_4$ on vertices a, b,
c, d. Note that the coloring is translation invariant: (i + k) − (j + k) = i − j.
Therefore we may assume that a = 0. Multiply each vertex by $b^{-1}$ (the
multiplicative inverse of b), and note that either no edge changes color (if $b \in R_{17}$)
or else every edge changes color (if $b \in N_{17}$). The reason for this is that
$b^{-1}i - b^{-1}j = b^{-1}(i - j)$. In either case, we now have a monochromatic $K_4$ on
vertices 0, 1, $cb^{-1}$, $db^{-1}$. Now, because 1 is a quadratic residue, the other
differences $cb^{-1}$, $db^{-1}$, $cb^{-1} - 1$, $db^{-1} - 1$, $db^{-1} - cb^{-1}$ must be quadratic
residues. Upon inspection of the elements of $R_{17}$, we see that this is impossible.
Therefore R(4, 4) > 17, and we conclude that R(4, 4) = 18.
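
The "inspection" step can also be delegated to a machine. The following is a minimal sketch, not part of the original argument: it rebuilds the quadratic-residue coloring of $K_{17}$ described above and checks all 2380 four-vertex subsets directly. The names `RESIDUES`, `color`, and `monochromatic_k4_exists` are illustrative choices.

```python
from itertools import combinations

P = 17
# Quadratic residues modulo 17: the squares of the nonzero residue classes.
RESIDUES = {(x * x) % P for x in range(1, P)}


def color(i, j):
    """Color of edge ij: green if i - j is a quadratic residue mod 17, else red."""
    return "green" if (i - j) % P in RESIDUES else "red"


def monochromatic_k4_exists():
    """Return True if some 4 vertices span six edges of a single color."""
    for quad in combinations(range(P), 4):
        colors = {color(i, j) for i, j in combinations(quad, 2)}
        if len(colors) == 1:
            return True
    return False


if __name__ == "__main__":
    print(sorted(RESIDUES))
    print("monochromatic K4 present:", monochromatic_k4_exists())
```

Because −1 is a quadratic residue modulo 17, `color(i, j)` and `color(j, i)` agree, so the coloring is well defined on unordered edges.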

The other two-color Ramsey numbers R(a, b) are considerably more difficult
to evaluate. The above construction involving quadratic residues was discovered
in 1955 by R. E. Greenwood and A. M. Gleason. Although it gives the exact
Ramsey number in the case of R(4, 4), the method gives only bounds for higher
numbers. For example, using this technique one can show that 38 ≤ R(5, 5), but
in fact other techniques have shown that 43 ≤ R(5, 5). We present all of the
known nontrivial Ramsey numbers in Table 4.1. The notation l/u means that l
and u are the best known lower and upper bounds for that particular Ramsey
number. We refer readers to [11] and to the dynamic survey by S. Radziszowski
found in the Electronic Journal of Combinatorics at
http://www.combinatorics.org.


Table 4.1 Ramsey numbers R(a, b).

 a \ b     3      4        5         6         7          8          9
   3       6      9       14        18        23         28         36
   4             18       25     36/41     49/61      56/84     73/115
   5                   43/49     58/87    80/143    101/216    126/316
   6                           102/165   113/298    132/495    169/780
   7                                     205/540   217/1031   241/1713
   8                                               282/1870   317/3583
   9                                                          565/6588

Open problem. Determine R(5, 5).


Open problem. Determine a formula for R(n, n). 


We know that R(5, 5) ≤ R(4, 5) + R(5, 4) = 50. Unfortunately, this still leaves
an enormous computation problem in evaluating R(5, 5). The naive approach,
examining all $2^{\binom{49}{2}}$ labeled graphs on 49 vertices, is intractable.

When we consider more than two colors, the only known nontrivial Ramsey
number is R(3, 3, 3) = 17, whose proof we leave as an exercise. The only known
nontrivial t-uniform hypergraph Ramsey number with t ≥ 3 is R(4, 4; 3) = 13.
This state of limited knowledge is exasperating because Ramsey numbers are 
intimately connected with other numbers and functions, as we shall see later in 
this chapter. Any new Ramsey number would be very valuable. 

Let us now consider lower and upper bounds for the diagonal Ramsey
numbers R(a, a). The trivial lower bound $R(a, a) > (a - 1)^2$ is immediate: join
with green edges a − 1 disjoint copies of a red $K_{a-1}$; this coloring has no
monochromatic $K_a$. A more sophisticated lower bound is obtained by the
probabilistic method in the next section.


To find an upper bound we use Pascal’s identity. We recall that R(a, 2) = a for
a ≥ 2 and R(a, b) ≤ R(a − 1, b) + R(a, b − 1) for a, b ≥ 3.


Upper bound for Ramsey numbers. For all a, b ≥ 2, we have

\[ R(a, b) \le \binom{a+b-2}{a-1}. \]

Proof. We use induction on a and b, noting that $R(a, 2) = a = \binom{a}{a-1}$ and
$R(2, b) = b = \binom{b}{1}$, so the inequality holds when b = 2 or a = 2. Suppose that the
inequality holds for R(a − 1, b) and R(a, b − 1) for a, b ≥ 3. Then

\[ R(a, b) \le R(a-1, b) + R(a, b-1) \le \binom{a+b-3}{a-2} + \binom{a+b-3}{a-1} = \binom{a+b-2}{a-1}, \]

and the inequality is established. ∎
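
As a numerical sanity check on the induction, the following sketch (an illustration only, with hypothetical function names) computes the bound obtained by iterating the recurrence from the boundary values R(a, 2) = a and R(2, b) = b, and compares it with the binomial coefficient in the theorem. The two agree exactly because both satisfy Pascal's identity with the same boundary values.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def recurrence_bound(a, b):
    """Upper bound on R(a, b) from R(a, b) <= R(a-1, b) + R(a, b-1)."""
    if a == 2:
        return b  # R(2, b) = b
    if b == 2:
        return a  # R(a, 2) = a
    return recurrence_bound(a - 1, b) + recurrence_bound(a, b - 1)


def binomial_bound(a, b):
    """The closed form C(a + b - 2, a - 1) stated in the theorem."""
    return comb(a + b - 2, a - 1)


if __name__ == "__main__":
    for a in range(2, 7):
        print([recurrence_bound(a, b) for b in range(2, 7)])
```

Note that this bound is not sharp in general: it gives 10 for R(3, 4) and 20 for R(4, 4), whereas the text shows the true values are 9 and 18.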


For diagonal Ramsey numbers, the upper bound becomes $R(a, a) \le \binom{2a-2}{a-1}$, and
we can determine an asymptotic estimate. One of the great open problems of
Ramsey theory is to calculate $\lim_{a \to \infty} R(a, a)^{1/a}$ (if it exists).
It follows that

\[ R(a, a) \le \binom{2a-2}{a-1} < 2^{2a-2} = 4^{a-1}. \tag{4.9} \]

Thus, we obtain an asymptotic upper bound for $R(a, a)^{1/a}$:

\[ \limsup R(a, a)^{1/a} \le \limsup 4^{(a-1)/a} = 4. \tag{4.10} \]

Using Stirling’s asymptotic approximation to the factorial function,

\[ n! \sim n^n e^{-n} (2\pi n)^{1/2}, \tag{4.11} \]

we can improve the upper bound a little. Since $\binom{2a-2}{a-1} < \binom{2a}{a}$, and

\[ \binom{2a}{a} \sim \frac{(2a)^{2a} e^{-2a} (2\pi \cdot 2a)^{1/2}}{a^{2a} e^{-2a}\, 2\pi a}, \tag{4.12} \]

it follows that

\[ R(a, a) < \frac{4^a}{(\pi a)^{1/2}} \bigl(1 + o(1)\bigr), \]

where o(1) is a function of a which tends to 0 as a tends to ∞.

A lower bound for $\liminf R(a, a)^{1/a}$ is determined in the next section.


EXERCISES 
4.11 Find a 2-coloring of $K_8$ that proves R(3, 4) > 8.
4.12 Find a 2-coloring of $K_{13}$ that proves R(3, 5) > 13.


4.13 Prove R(3, 3, 3) = 17. 
4.14 Prove that if K397 is 5-colored, there exists a monochromatic K3. 


4.4 The probabilistic method 


To obtain a good lower bound for R(a, a), we turn to the probabilistic method, a 
technique used widely throughout existential combinatorics. See [2]. The idea is 
to turn the objects in question (green—red colorings of a graph) into events in a 
probability space and demonstrate that a desired event (a coloring containing no 
monochromatic subgraph of specified size) occurs with positive probability. If D 
is a set of desired objects in a sample space S, then the probability that a random 
object is desired equals Pr(D) = |D|/|S|. If we can show that Pr(D) is positive,
then D is nonempty and there exists a desired object. Usually, probabilistic
arguments can be framed directly in terms of the cardinalities |D| and |S|.
However, the probabilistic language has undisputed bookkeeping and conceptual 
advantages in proving complex theorems. To illustrate the distinction and 
parallelism between the two points of view, we present two proofs of the 
following lower bound for R(a, a), one in terms of cardinality and the other in 
terms of probability. 


Lower bound for Ramsey numbers. If $\binom{n}{a} 2^{1-\binom{a}{2}} < 1$, then R(a, a) > n.


Proof 1 (Cardinality). Because each of the $\binom{n}{2}$ edges of $K_n$ may be colored
independently, the number of green-red colorings is $2^{\binom{n}{2}}$. The number of
green-red colorings of $K_n$ with a monochromatic $K_a$ is $|\bigcup A_S|$, where $A_S$ is the
collection of green-red colorings in which the subgraph S is monochromatic and
S ranges over all possible subgraphs of $K_n$ isomorphic to $K_a$. We bound $|\bigcup A_S|$ as
follows:

\[ \Bigl|\bigcup A_S\Bigr| \le \sum |A_S| = \binom{n}{a}\, 2^{\binom{n}{2} - \binom{a}{2} + 1} < 2^{\binom{n}{2}}. \]

The first inequality is an enumeration estimate proved by induction on the
number of terms in the union; it also follows from the inclusion-exclusion
principle. The equality follows from the observation that there are $\binom{n}{a}$ copies of
$K_a$ inside $K_n$. Since each $K_a$ is monochromatic, there are two choices for the
color of its edges. The remaining $\binom{n}{2} - \binom{a}{2}$ edges of $K_n$ are colored green or red
arbitrarily. The final strict inequality is the hypothesis of the theorem multiplied
through by $2^{\binom{n}{2}}$.

Now, because $|\bigcup A_S|$ is less than the total number of green-red colorings of $K_n$,
there is a coloring which does not contain a monochromatic $K_a$. Therefore
R(a, a) > n.

Proof 2 (Probability). Suppose the edges of $K_n$ are randomly and independently
colored green or red. Think of flipping a coin for each edge. If the coin lands
heads, then the edge is colored green; if it lands tails, then red. For each
subgraph S of $K_n$ isomorphic to $K_a$, let $A_S$ be the event that S is monochromatic.
We have

\[ \Pr(A_S) = \Pr(S \text{ is green}) + \Pr(S \text{ is red}) = 2^{-\binom{a}{2}} + 2^{-\binom{a}{2}} = 2^{1-\binom{a}{2}}. \]

Therefore

\[ \Pr\Bigl(\bigcup A_S\Bigr) \le \sum_S \Pr(A_S) \ \text{(subadditivity of probability)} \ = \binom{n}{a} 2^{1-\binom{a}{2}} < 1. \]

The complement of the event $\bigcup A_S$ occurs with positive probability; hence there
exists a desired configuration, namely a 2-coloring of $K_n$ with no monochromatic $K_a$.
Again, R(a, a) > n.
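
To see what the theorem yields for specific values of a, the hypothesis can be tested in exact integer arithmetic. The sketch below is illustrative (the function names are not from the text): it rewrites the condition as 2·C(n, a) < 2^C(a, 2) and returns the largest n that the theorem certifies.

```python
from math import comb


def condition_holds(n, a):
    """Check C(n, a) * 2^(1 - C(a, 2)) < 1 without floating point."""
    return 2 * comb(n, a) < 2 ** comb(a, 2)


def largest_certified_n(a):
    """Largest n for which the theorem guarantees R(a, a) > n."""
    n = a - 1  # C(a - 1, a) = 0, so the condition holds trivially here
    while condition_holds(n + 1, a):
        n += 1
    return n


if __name__ == "__main__":
    for a in range(3, 11):
        print(a, largest_certified_n(a))
```

For small a the certified values are weak (for instance, only R(4, 4) > 6, compared with the true value 18); the strength of the method lies in its asymptotic growth rate, as the analysis that follows shows.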

The theorem contains an implicit lower bound for R(a, a), if we can untangle
it. Fix a and let N be the minimum value of n satisfying $\binom{n}{a} 2^{1-\binom{a}{2}} \ge 1$. Then

\[ R(a, a) \ge N = (N^a)^{1/a} > \left( \binom{N}{a}\, a! \right)^{1/a} \ge \left( 2^{\binom{a}{2} - 1}\, a! \right)^{1/a} = 2^{a/2 - 1/2 - 1/a} (a!)^{1/a}. \tag{4.13} \]

From Stirling’s asymptotic formula for the factorial function, it follows that

\[ R(a, a) > a\, 2^{a/2} \left[ \frac{1}{e\sqrt{2}} + o(1) \right]. \tag{4.14} \]

Finally, we have

\[ \liminf R(a, a)^{1/a} \ge \liminf \left\{ a\, 2^{a/2} \left[ \frac{1}{e\sqrt{2}} + o(1) \right] \right\}^{1/a} = \sqrt{2}. \tag{4.15} \]

Combining this lower bound with our previously obtained upper bound, we
obtain bounds on $\lim_{a \to \infty} R(a, a)^{1/a}$ (if it exists):

\[ \sqrt{2} \le \liminf R(a, a)^{1/a} \le \lim R(a, a)^{1/a} \le \limsup R(a, a)^{1/a} \le 4. \tag{4.16} \]


These are the best bounds known at present. 


Open problem. Determine whether $\lim_{a \to \infty} R(a, a)^{1/a}$ exists and find its
value.


In 1995 J. H. Kim proved the first conclusive result about the growth of R(n, k)
for fixed k. He showed that the order of magnitude of R(n, 3) is $n^2/\log n$. See [6].


Open problem. Determine the asymptotic behavior of R(n, 4). 


EXERCISES 


4.15 Obtain a lower bound for R(100, 100). 
4.16 Use the probabilistic method to prove that almost all labeled graphs 
have diameter 2; hence almost all labeled graphs are connected. 


4.17 Use the probabilistic method to prove Schütte’s theorem: For every m
there exists a tournament T such that for each S ⊂ T, |S| = m, there exists a
vertex p ∈ T − S which is directed to each vertex of S. Find such tournaments
for m = 1 and m = 2.
Hint: The tournament for m = 2 can be constructed from the set of
quadratic residues modulo 7 as follows. Let $R_7$ and $N_7$ be the set of
quadratic residues and nonresidues modulo 7, respectively. Put a directed
arrow from vertex i to vertex j if $i - j \in R_7$ and an arrow from j to i if
$i - j \in N_7$. Check that this tournament has the desired property.

Also prove that this tournament is unique up to isomorphism.

Hint: First prove that every vertex has outdegree 3. Next prove that if
vertex a is directed to vertices b, c, d, then b, c, d form a cyclic triple.
Schütte’s theorem was proved by Paul Erdős in 1963.


4.5 Schur’s theorem 


In this section, we prove a proposition about equations as a corollary of 
Ramsey’s theorem, and in the next section we prove van der Waerden’s theorem, 
an elegant statement about arithmetic progressions. The theme is that of finding 
order in disorder. 

Given c, n ≥ 1, we consider functions $f : N_n \to \{A_1,\dots,A_c\}$. As usual, we think
of the $A_i$ as colors and f as assigning a color to each integer, thereby partitioning
$N_n$ into color classes. If S is a set of positive integers and f restricted to S is a
constant function, then S is monochromatic. What kinds of monochromatic sets
can we find given that n is large enough compared to c? One answer to this
question was provided by Issai Schur (1875–1941) in 1916.


Schur’s theorem. For each c ≥ 1, there exists a least integer n = S(c) with
the following property: For any function $f : N_n \to \{A_1,\dots,A_c\}$, there
exists an $A_i$ containing x, y, z (with x = y allowed) such that x + y = z. In
other words, there is a monochromatic solution to the equation x + y = z.


Proof. Let m = R(3,...,3) − 1, where R(3,...,3) is the c-color Ramsey number
that guarantees a monochromatic triangle. We claim that m has the desired
property, and hence S(c) exists and satisfies S(c) ≤ m. The function
$f : N_m \to \{A_1,\dots,A_c\}$ generates a c-coloring of the complete graph on vertices
1, 2,...,m + 1 by assigning to edge ij the color that has been assigned to the
integer |i − j|. The presence of a monochromatic triangle on, say, vertices u, v, w
(u < v < w) implies that the equation x + y = z has the monochromatic solution
(v − u) + (w − v) = (w − u).

Although it is considered an important part of Ramsey theory, Schur’s theorem 
was introduced by Schur in an attempt to prove Fermat’s last theorem. See 
Notes. 

The integers S(c) are called c-color Schur numbers. It is trivial to observe that 
S(1) = 2 (1 + 1 = 2). We leave it to the reader to check that S(2) = 5 and S(3) = 
14. The only other known Schur number is S(4) = 45. Thus there is a general 
state of ignorance about Schur numbers, although they are linked to the equally 
mysterious Ramsey numbers by the inequality 


\[ S(c) \le R(3,\dots,3) - 1. \tag{4.17} \]
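
The small Schur numbers quoted above can be confirmed by a short backtracking search. The following is a sketch under the convention used in this section (S(c) is the least n that forces a monochromatic solution, with x = y allowed); the function names are illustrative. It colors 1, 2, 3,... one integer at a time, keeping every color class free of solutions to x + y = z, and records the longest initial segment that can be colored.

```python
def longest_sum_free_coloring(c):
    """Largest n such that {1,...,n} has a c-coloring with no monochromatic x + y = z."""
    best = 0
    classes = [set() for _ in range(c)]

    def extend(k):
        # Integers 1..k-1 are already colored with every class sum-free; try to place k.
        nonlocal best
        best = max(best, k - 1)
        for cls in classes:
            # k may join cls unless k = x + y for some x, y in cls (x = y allowed).
            if all((k - x) not in cls for x in cls):
                cls.add(k)
                extend(k + 1)
                cls.remove(k)
            if not cls:
                break  # all remaining classes are empty too; skip symmetric cases

    extend(1)
    return best


def schur_number(c):
    """S(c) in the convention of the text: one more than the longest sum-free coloring."""
    return longest_sum_free_coloring(c) + 1


if __name__ == "__main__":
    print([schur_number(c) for c in (1, 2, 3)])
```

The search is fast for c ≤ 3; the running time grows steeply with c, so this sketch is not intended for larger cases.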


Open problem. Find the value of S(5). 


EXERCISES 


4.18 Prove S(2) = 5 and S(3) = 14.

4.19 Suppose that we have a sum-free c-coloring of {1,...,n}, with the
partition $\{A_1,\dots,A_c\}$. Then we can obtain a sum-free (c + 1)-coloring of
{1,...,3n + 1} as follows. For each i and each $a \in A_i$ include in $A_i$ the new
element a + 2n + 1. Define $A_{c+1} = \{n + 1,\dots,2n + 1\}$. Show that this
procedure works. What bounds does it give on S(c) for various values of c
and in general?

4.20 Let f(n) be the minimum number of triples (x, y, z) such that x + y = z
and x ≠ y when {1, 2,...,n} is 2-colored. Conjecture a formula for f(n). Such
a formula was found by T. Schoen.


4.6 Van der Waerden’s theorem 


Schur’s theorem states that any coloring function $f : N_n \to \{A_1,\dots,A_c\}$ forces a
monochromatic solution to the equation x + y = z (whenever n is sufficiently
large compared to c). What other monochromatic structures are forced? One
direction for generalization is provided by Rado’s theorem, which asserts the
existence of a monochromatic solution to the equation

\[ a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0 \]

as long as some nonempty subset of the $a_i$ sums to 0. If this condition is met, we
say that the equation is regular. For example, the equation
$2x_1 - 7x_2 + 3x_3 + 4x_4 - 6x_5 = 0$ is regular (−7 + 3 + 4 = 0). Another direction is
provided by B. L. van der Waerden’s 1927 theorem concerning arithmetic
progressions. An arithmetic progression of length l (or l-AP) is a sequence

\[ a,\ a + d,\ a + 2d,\ \dots,\ a + (l - 1)d \]

of l numbers (integers, say), where each consecutive pair differ by a constant
number d ≥ 1. For example, the sequence 20, 30, 40, 50, 60, 70 is a 6-AP. Van
der Waerden’s theorem asserts the existence of a monochromatic l-AP when $N_n$
is partitioned into c classes (and n is sufficiently large with respect to c and l).


Van der Waerden’s theorem (1927). Given c ≥ 1 and l ≥ 1, there exists
a least integer W = W(c, l) with the following property: If $N_W$ is
partitioned into c classes $A_1,\dots,A_c$, then one of the classes $A_i$ contains a
monochromatic l-AP.


Proof. The proof is by induction on c and l. In this proof, we use the notation [n]
for $N_n$. The theorem is trivially true for some ordered pairs c, l, and in these
cases we can actually determine the values of W(c, l): W(1, l) = l, W(c, 1) = 1,
W(c, 2) = c + 1. The first and third of these statements are the basis of the
induction. We shall assume the existence of W(d, l) for every d and prove the
existence of W(c, l + 1). The reader is encouraged to envision a table of c and l,
and judge whether this plan would really cover all ordered pairs c, l. We claim
that W(c, l + 1) exists and satisfies W(c, l + 1) ≤ f(c), where f is defined
recursively:

\[ f(1) = 2W(c, l), \qquad f(n) = 2W\bigl(c^{f(n-1)}, l\bigr)\, f(n-1), \quad n \ge 2. \tag{4.18} \]

As in the proof of Ramsey’s theorem, we are establishing an existence result by
constructing an upper bound. However, the formulas in the upper bound grow
too rapidly to furnish much insight into the exact values of W(c, l).

Suppose that [f(c)], which we call a c-block, is c-colored without a
monochromatic (l + 1)-AP, and [f(c)] is partitioned into f(c)/f(c − 1) blocks of
f(c − 1) consecutive integers, which we call (c − 1)-blocks. Likewise, each
(c − 1)-block is partitioned into f(c − 1)/f(c − 2) blocks of f(c − 2) consecutive
integers, which we call (c − 2)-blocks. This partitioning happens at each of the c
levels, until, at last, each 1-block is partitioned into 2W(c, l) 0-blocks (which are
just integers).

By definition of W(c, l), the first half of each 1-block contains a
monochromatic l-AP. Here occurs the first leap of inspiration in the proof. The
coloring of the elements of a 1-block induces a coloring of the 1-block itself.
That is, we assign one of $c^{f(1)}$ colors to the 1-block according to the way its
elements are c-colored. Because $f(2) = 2W(c^{f(1)}, l)\, f(1)$, each 2-block contains
$2W(c^{f(1)}, l)$ 1-blocks, so that, by definition of $W(c^{f(1)}, l)$, the first half of each
2-block contains a monochromatic l-AP of 1-blocks. Similarly, the first half of
each 3-block contains a monochromatic l-AP of 2-blocks. This construction
happens at each level, so that the first half of [f(c)] contains a
monochromatic l-AP of (c − 1)-blocks. Let us consider only those integers which
lie in l-APs at all c levels of blocks. We coordinatize each integer as

\[ x = (x_1,\dots,x_c), \]

with $1 \le x_i \le l$, where $x_i$ is the position of x in the monochromatic l-AP of the
i-block in which it resides. All coordinatized integers have the same color, say $A_1$.
Within each 1-block, the l integers

\[ (1, x_2,\dots,x_c),\ (2, x_2,\dots,x_c),\ \dots,\ (l, x_2,\dots,x_c) \]

constitute a monochromatic l-AP. Therefore, the integer $(l + 1, x_2,\dots,x_c)$ has a
color other than $A_1$, say $A_2$. Furthermore, the factor 2 in the definition of f(1)
implies that $(l + 1, x_2,\dots,x_c)$ occurs within the 1-block. (The 2 is a convenient
constant used to stretch the block enough to accommodate the (l + 1)st term of
an AP.) Here occurs the second leap of inspiration: the idea of focusing. Within a
2-block, the l integers

\[ (l + 1, 1, x_3,\dots,x_c),\ (l + 1, 2, x_3,\dots,x_c),\ \dots,\ (l + 1, l, x_3,\dots,x_c) \]

are a monochromatic l-AP of color $A_2$. This forces $(l + 1, l + 1, x_3,\dots,x_c)$ to be a
color other than $A_2$. However, we can focus a second l-AP on this integer,
namely,

\[ (1, 1, x_3,\dots,x_c),\ (2, 2, x_3,\dots,x_c),\ \dots,\ (l, l, x_3,\dots,x_c). \]

Thus, $(l + 1, l + 1, x_3,\dots,x_c)$ cannot be color $A_1$ or $A_2$; say it is colored $A_3$. Figure
4.3 illustrates the two focused progressions, representing colors $A_1$, $A_2$, $A_3$ by
dots, circles, and an x, respectively. (The dashes represent numbers with
undetermined colors.) Continuing this focusing process at each of the c levels,
we conclude that (l + 1, l + 1,...,l + 1) can be none of the colors $A_1,\dots,A_c$, a
contradiction. Therefore, there exists a monochromatic (l + 1)-AP.


Figure 4.3 Two 3-APs focusing on an integer within a 2-block.


The values of W(c, l) are called van der Waerden numbers. As we remarked in
the proof, the inequality W(c, l + 1) ≤ f(c) does little to establish good estimates
for them. In fact, the state of knowledge is even worse for van der Waerden
numbers than for Ramsey numbers. The seven known nontrivial van der
Waerden numbers are listed in Table 4.2. The proof of one of these values, W(2,
3) = 9, is called for in the exercises.


Table 4.2 The known van der Waerden numbers W(c, l) with c ≥ 2, l ≥ 3.

 c \ l      3      4      5      6
   2        9     35    178   1132
   3       27    293
   4       76
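
The smallest entry of the table is within reach of exhaustive search. The sketch below is an illustration rather than a proof technique from the text, and its function names are hypothetical: it checks every c-coloring of {1,...,n} for a monochromatic l-AP. It is practical only for very small parameters, since there are c^n colorings to examine.

```python
from itertools import product


def has_mono_ap(coloring, length):
    """coloring[i] is the color of the integer i + 1; look for a monochromatic AP."""
    n = len(coloring)
    for start in range(n):
        for diff in range(1, n):
            if start + (length - 1) * diff >= n:
                break  # larger differences overshoot as well
            if all(coloring[start + k * diff] == coloring[start] for k in range(length)):
                return True
    return False


def every_coloring_has_mono_ap(n, colors, length):
    """True if each of the colors**n colorings of {1,...,n} contains a monochromatic AP."""
    return all(has_mono_ap(col, length) for col in product(range(colors), repeat=n))


if __name__ == "__main__":
    print(every_coloring_has_mono_ap(8, 2, 3), every_coloring_has_mono_ap(9, 2, 3))
```

A coloring of {1,...,8} that avoids monochromatic 3-APs, such as the pattern 0, 0, 1, 1, 0, 0, 1, 1, shows why n = 8 is not enough.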


Open problem. Find W(5, 3). 


Although van der Waerden’s theorem asserts the existence of a
monochromatic l-AP, it does not tell us which color it is. The following theorem,
whose proof is beyond the scope of this book, guarantees the existence of a
monochromatic l-AP in any color that occurs “with positive density.” We define
the density function of a set S of positive integers to be

\[ d(S, n) = \frac{|S \cap \{1,\dots,n\}|}{n}. \tag{4.19} \]

The density function measures the fraction of the first n integers which occur in
S. Clearly, 0 ≤ d(S, n) ≤ 1 for all S and n.


Szemerédi’s theorem. For all real numbers d > 0 and all l ≥ 1, there is a
positive integer N(d, l) with the following property: If n ≥ N(d, l) and S ⊆
{1,...,n} with d(S, n) ≥ d, then S contains an l-AP.


Paul Erdős (1913-1996) conjectured the above result in 1935, but it was not
proved until 1975 by Endre Szemerédi. In 1977 Hillel Furstenberg gave a proof
using ergodic theory.


Conjecture. (Erdős) If {a_i} ⊆ N and Σ 1/a_i is a divergent series, then {a_i}
contains arbitrarily long arithmetic progressions.


It is well known that Σ 1/p_i diverges if {p_i} is the set of primes (see [15]), and
in 2006 Ben Green and Terence Tao proved that there exist arbitrarily long
arithmetic progressions of primes. Erdős’ conjecture is still open. In 2010 a
26-AP of primes was found:


43,142,746,595,714,191 + 5,283,234,035,979,900n, 0 ≤ n ≤ 25.
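
The 26 terms of this progression can be checked for primality directly. The following is a minimal sketch, assuming a Miller-Rabin test with the first twelve primes as bases (a choice known to be deterministic for numbers below 2^64, which covers every term here); the names `is_prime`, `START`, and `STEP` are ours.

```python
def is_prime(n):
    """Miller-Rabin test; deterministic for n < 2**64 with these twelve bases."""
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    if n < 2:
        return False
    for p in bases:
        if n % p == 0:
            return n == p
    # Write n - 1 = d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


START = 43_142_746_595_714_191
STEP = 5_283_234_035_979_900
terms = [START + STEP * n for n in range(26)]
print(all(is_prime(t) for t in terms))
```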


EXERCISES 


4.21 Prove W(2, 3) = 9. 
4.22 Find upper bounds for W(3, 4) and W(4, 4). 
4.23 Prove or disprove: If N is 2-colored, then there exists a monochromatic 
infinite AP. 
4.24 Prove or disprove: If R is 2-colored, then there exist a, b, c ∈ R with a,
b, c all the same color and (c − b)/(b − a) = √2.
4.25 (Putnam Competition, 1960) Consider the arithmetic progression 
a,a+d,a+2d,..., 
where a and d are positive integers. For any positive integer k, prove that the 
progression has either no exact kth powers or infinitely many. 
4.26 Find a 6-AP of prime numbers. 
4.27 Prove that for any positive integers c and l, there exists a number W
with the property that, whenever the set N_W is c-colored, there exists an l-AP
with each of its terms and the common difference the same color.
4.28 Using the compactness principle, prove that the following theorem is
equivalent to van der Waerden’s theorem: For all c, l ≥ 1, no matter how N
is c-colored, there exists a monochromatic l-AP.
4.29 With the notation of Szemerédi’s theorem, suppose that there exists a
density d < 1 such that N(d, l) exists for all l ≥ 1. Prove that N(d^2, l) exists
by showing that it satisfies
N(d^2, l) ≤ N(d, l) · N(d, W(N(d, l), l)),
where W is the van der Waerden function. Thus conclude that N(d, l) exists
for arbitrarily small d > 0 and all l ≥ 1.
4.30 Let r_k(n) be the greatest integer l such that there is a sequence of
integers 1 ≤ a_1 < ··· < a_l ≤ n which does not contain a k-AP. Prove that
r_k(m + n) ≤ r_k(m) + r_k(n).
Prove that this implies that
lim_{n→∞} r_k(n)/n
exists for each k.


Notes 


For original papers of Frank Ramsey, Paul Erdős and George Szekeres, and R. P.
Dilworth, see [9].

Ramsey numbers have been generalized in many ways. For example, in 1972
Václav Chvátal and Frank Harary defined the graph Ramsey number r(G, H) to
be the minimum number of vertices in a complete graph which, when 2-colored,
yields a green subgraph G or a red subgraph H. They showed that

r(G, H) > (χ(G) − 1)(p(H) − 1),
where χ(G) is the chromatic number of G and p(H) is the number of vertices of
H. They used this inequality to prove r(T_m, K_n) = (m − 1)(n − 1) + 1, where T_m
is a tree with m vertices. See [11].

Schur’s theorem was proven by Issai Schur in an attempt to prove Fermat’s
last theorem (FLT). Although Schur didn’t prove FLT, he did prove that, for all
n, if p is prime and sufficiently large, then the congruence x^n + y^n ≡ z^n has a
nonzero solution modulo p. Briefly, the argument is to suppose that p is prime
and greater than S(n). Thus if {1,...,p − 1} is n-colored, there exists a
monochromatic subset {a, b, c} with a + b = c. Let H = {x^n : x ∈ Z*_p}, a
subgroup of Z*_p of index gcd(n, p − 1) ≤ n. The cosets of H define a coloring f
of Z*_p with at most n colors, so there exist a, b, c with f(a) = f(b) = f(c) and
a + b = c. This implies that 1 + a^{-1}b = a^{-1}c (in Z_p); and in fact 1, a^{-1}b, and
a^{-1}c are all nth powers in Z_p.


B. L. van der Waerden (1903-1996) proved his 1927 theorem as a 
generalization of the following conjecture of Schur: If N is partitioned into two 
classes, then one of the classes contains arbitrarily long arithmetic progressions. 

Ramsey’s theorem (in its various formulations) and van der Waerden’s 
theorem are usually thought of as the two cornerstone theorems of Ramsey 
theory. See [11] for a further discussion of these theorems and other theorems of 
Ramsey theory, including Gallai’s theorem, Rado’s theorem, Folkman’s 
theorem, and the Hales—Jewett theorem. 


CHAPTER 5 


ERROR-CORRECTING CODES 


Sixteen unit hyperspheres can be arranged in R^7 so that each hypersphere is
tangent to exactly seven of the other hyperspheres. 

This configuration of spheres is called a perfect packing. How do we obtain 
such a packing? What makes it perfect? What are its combinatorial properties? 
In this chapter and the next, we investigate such combinatorial designs, paying 
close attention to the interrelationships among the constructions and often 
finding equivalences between seemingly different structures. As a capstone, we
construct the (23, 2^12, 7) Golay code G_23, the S(5, 8, 24) Steiner system, and
Leech’s 24-dimensional lattice L. We begin our tour of combinatorial
constructions with practical examples called codes.


5.1 Binary codes


Let F = {0, 1}, the field of two elements. Then F^n, the collection of strings of
length n over F, is a vector space of dimension n over F. We can picture F^n as
the set of vertices of the n-dimensional unit hypercube. For example, Figure 5.1
depicts F^3 as the set of vertices of the cube. These vertices are coordinatized
with the eight vector representatives 000, 001, 010, 011, 100, 101, 110, and 111.
Note that two vertices are edge adjacent if and only if their vector
representatives differ in exactly one coordinate.


Figure 5.1 F^3 as the set of vertices of a cube.


Given any two binary strings v, w ∈ F^n, we define the Hamming distance d(v,
w) between v and w to be the number of coordinates where v and w differ. This is
also the length of the shortest edge path in the hypercube between v and w. For
example, we can see in Figure 5.1 that d(011, 101) = 2.
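
The definition translates directly into code. The sketch below represents binary strings as Python strings of '0' and '1' characters; this representation and the name `hamming_distance` are our own choices for illustration.

```python
def hamming_distance(v, w):
    """Number of coordinates in which two equal-length binary strings differ."""
    if len(v) != len(w):
        raise ValueError("strings must have the same length")
    return sum(a != b for a, b in zip(v, w))


print(hamming_distance("011", "101"))  # prints 2
```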

The function d is a metric, which we call the Hamming metric or Hamming
distance, and F^n is a metric space.


Theorem. The function d is a metric on F^n; that is, for all v, w, x ∈ F^n, the
following properties hold:


(1) d(v, w) ≥ 0 with equality only when v = w (positivity);

(2) d(w, v) = d(v, w) (symmetry);

(3) d(v, w) + d(w, x) ≥ d(v, x) (triangle inequality).
Proof. Properties (1) and (2) are immediate from the definition of d. We prove
the triangle inequality by verifying that it is preserved componentwise. Let v_i,
w_i, x_i be the ith components of the vectors v, w, x, respectively. If v_i = w_i = x_i,
then the contribution to both sides of the inequality is 0. If not, then the
contribution to the left side is at least 1 while the contribution to the right side is
at most 1. Hence, the inequality is preserved componentwise. (It is also easy to
“see” that d is a metric by realizing that d(x, y) is the shortest path length
between x and y along the edges of a cube embedded in R^n.)

The weight w(v) of a vector v is the number of 1's in the vector representation
of v. A simple componentwise proof demonstrates that d(x, y) = w(x − y) for any
x, y ∈ F^n.


The Hamming sphere with radius r and center c is the set of all v ∈ F^n such
that d(v, c) ≤ r. The volume of the sphere is the number of elements in it:
$\sum_{k=0}^{r} \binom{n}{k}$. Note that $\binom{n}{k}$ counts selections of the k coordinates in which c and v
disagree.
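
To make the volume formula concrete, the following sketch compares the binomial sum with a direct enumeration of a sphere. The function names and the tuple representation of vectors are our own assumptions for illustration.

```python
from itertools import product
from math import comb


def sphere_volume(n, r):
    """Number of vectors of F^n within Hamming distance r of a fixed center."""
    return sum(comb(n, k) for k in range(r + 1))


def sphere(center, r):
    """Enumerate the Hamming sphere of radius r around `center` (a 0/1 tuple)."""
    n = len(center)
    return [v for v in product((0, 1), repeat=n)
            if sum(a != b for a, b in zip(v, center)) <= r]


center = (0,) * 7
print(len(sphere(center, 1)), sphere_volume(7, 1))  # prints 8 8
```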

It is difficult to picture spheres when n is large (and they don’t look very 
spherical). We will mainly be interested in how densely they can be packed, 
because, as we shall see, dense packings signify good codes. 


A code A is a subset of F^n with |A| ≥ 2. The elements of A are called
codewords. In real-life applications, information can be sent reliably over a
noisy channel by encoding redundancy in the message. A codeword v ∈ A is
transmitted and a possibly distorted vector v’ is received. As it might happen that 
v’ equals a codeword in A different from v, it is not always possible to tell 
whether any errors have been committed in the transmission. However, if the 
Hamming distances between pairs of codewords are fairly large (which is 
achievable with redundancy), it is unlikely that v’ will equal another codeword. 
If the Hamming distances are large enough, it may be possible to detect errors 
when they occur and correct them. 

The distance d(A) of a code A is the minimum Hamming distance between
distinct codewords in A. For example, the code A = {011, 101, 110} ⊆ F^3 has
distance d(A) = 2, because any two of the vectors in A differ by two bits. It is
always true that 1 ≤ d(A) ≤ n.

A code with distance d ≥ e + 1 detects e errors. A code with distance d ≥ 2e +
1 corrects e errors. To justify these definitions, suppose that a codeword v is sent
and a string v’ is received, with 1 ≤ d(v, v’) ≤ e. If d ≥ e + 1, then v’ cannot equal
another codeword x, or else d(v, v’) = d(v, x) ≥ e + 1, a contradiction. Therefore,
we can detect that at least one error has occurred. If d ≥ 2e + 1, then v’ cannot
have resulted from the transmission of a different codeword x (and at most e
errors), or else 2e + 1 ≤ d(v, x) ≤ d(v, v’) + d(v’, x) ≤ e + e = 2e, a contradiction.
Therefore, we can identify the particular vector v that was sent and correct the
errors.
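
The definitions above can be exercised on small codes. The sketch below computes d(A), the number of errors detected and corrected, and decodes by choosing the nearest codeword; all function names are hypothetical, and nearest-codeword decoding is only guaranteed to recover the sent word when at most e errors occur and d(A) ≥ 2e + 1.

```python
from itertools import combinations


def distance(v, w):
    """Hamming distance between two equal-length binary strings."""
    return sum(a != b for a, b in zip(v, w))


def code_distance(code):
    """Minimum Hamming distance between distinct codewords."""
    return min(distance(v, w) for v, w in combinations(code, 2))


def errors_detected(code):
    return code_distance(code) - 1          # largest e with d >= e + 1


def errors_corrected(code):
    return (code_distance(code) - 1) // 2   # largest e with d >= 2e + 1


def decode(code, received):
    """Return a codeword nearest to the received string."""
    return min(code, key=lambda c: distance(c, received))


A = ["011", "101", "110"]
print(code_distance(A), errors_detected(A), errors_corrected(A))  # prints 2 1 0
print(decode(["000", "111"], "001"))  # prints 000
```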


EXAMPLE 5.1 Triplicate code 


The code {000, 111} ⊆ F^3 is called a triplicate code. Under the map 0 ↦
000 and 1 ↦ 111, each bit is tripled. If an error occurs (a bit is switched
from 0 to 1 or 1 to 0), then we can still tell which message was intended.
Thus, the code has distance 3 and is capable of correcting one error. If a
codeword is transmitted and we receive 001, then, under the assumption that
at most one error has been committed, the intended word must be 000. Let q
be the probability that a bit is mistakenly altered from 0 to 1 or from 1 to 0,
and suppose that bits are altered independently of one another. The
decoding scheme fails if two or three errors occur, and this happens with
probability 3q^2(1 − q) + q^3, which is asymptotic to 3q^2 for q small. This
probability of failure is much smaller than the probability q of failure when
no code is used. However, the increase in reliability is paid for by a decrease
in transmission rate. The rate r(A) of a code A, defined as r(A) = (log_2 |A|)/n,
measures the amount of information per code bit conveyed over the
communication channel. Always, 1/n ≤ r(A) ≤ 1. In the above example, r(A)
= (log_2 2)/3 = 1/3, which means that when information is sent in triplicate
the rate decreases by a factor of 3 (which is reasonable).
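
The failure probability and the rate in this example can be recomputed numerically. The sketch below enumerates the eight error patterns on three bits rather than using the closed form; the function names and the sample value q = 0.01 are our own choices.

```python
from itertools import product
from math import log2


def failure_probability(q):
    """Probability that two or three of the three transmitted bits are flipped."""
    total = 0.0
    for pattern in product((0, 1), repeat=3):   # 1 marks a flipped bit
        k = sum(pattern)
        if k >= 2:
            total += q**k * (1 - q) ** (3 - k)
    return total


def rate(code):
    """Rate (log_2 |A|)/n of a code given as a list of equal-length strings."""
    return log2(len(code)) / len(code[0])


q = 0.01
print(failure_probability(q), 3 * q**2 * (1 - q) + q**3)
print(rate(["000", "111"]))
```

For q = 0.01 both expressions are about 0.000298, compared with a failure probability of 0.01 when no code is used.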


We refer to a code A ⊆ F^n with distance d(A) = d as an (n, |A|, d) code. The
number n is sometimes called the dimension of the code, and |A| is called the size 
of the code. Let there be no confusion between the distance d (an integer) and 
the Hamming metric d (a function). 

For n fixed, there is an inverse relationship between |A| and d (and therefore 
between the rate and the error-correcting capability of the code). The 
fundamental problem of coding theory is to find codes with high rate and large 
distance. The next theorem gives sharp focus to this problem. 


Hamming upper bound. If A corrects e errors, then

(5.1)   \[ |A| \le \frac{2^n}{\sum_{k=0}^{e} \binom{n}{k}}. \]


Proof. Because d ≥ 2e + 1, the spheres of radius e centered at the codewords do
not intersect. Therefore, the total volume of the spheres is at most the cardinality
of F^n; that is,

\[ |A| \sum_{k=0}^{e} \binom{n}{k} \le 2^n, \]

from which the upper bound follows instantly.
The Hamming upper bound for |A| is also called the sphere packing bound.
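
As a quick numerical illustration of the bound, the sketch below evaluates the largest integer |A| allowed by the sphere packing inequality. The name `hamming_bound` is ours, and the function reports only the bound, not whether a code attaining it exists.

```python
from math import comb


def hamming_bound(n, e):
    """Largest integer M satisfying M * sum_{k<=e} C(n, k) <= 2**n."""
    return 2**n // sum(comb(n, k) for k in range(e + 1))


print(hamming_bound(3, 1))   # prints 2, attained by the triplicate code
print(hamming_bound(7, 1))   # prints 16
print(hamming_bound(10, 2))  # prints 18
```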


EXERCISES 


5.1 Prove that d(x, y) = w(x − y) for any x, y ∈ F^n.
5.2 Prove that d(x, y) = d(x + z, y + z) for any x, y, z ∈ F^n.
5.3 Prove that w(x + y) = w(x) + w(y) − 2x · y for any x, y ∈ F^n.


5.4 Find a code in F^4 with eight codewords and Hamming distance 2. How
many errors can this code detect?


5.2 Perfect codes 


If the Hamming upper bound is achieved for a code A, we say that A is perfect.
Perfect codes correspond to sphere packings of F^n with no wasted space (vectors
not in a sphere). If A is perfect, then $\sum_{k=0}^{e} \binom{n}{k}$ must divide 2^n and hence be a
power of 2. We will soon see that this rarely happens.

If |A| = 2, then the maximum value of d(A) is n and this value is achieved, for
instance, when A consists of the all-0 vector and the all-1 vector. If |A| > 2, then
some two vectors must agree on any given component, so d(A) < n. Therefore, n
> d ≥ 2e + 1, which implies that

(5.2)   e < (n − 1)/2.

If e = 1, then $\sum_{k=0}^{1} \binom{n}{k} = \binom{n}{0} + \binom{n}{1} = 1 + n$, which is a power of 2 when
n = 2^r − 1 for some r ≥ 2. In this case, |A| = 2^n/2^r = 2^{n−r} = 2^m, where
m = n − r = 2^r − 1 − r. Hence, the parameters of such a code are
(n, |A|, d) = (2^r − 1, 2^m, 3).

The next theorem says that there are only two feasible sets of parameters for
perfect codes when 1 < e < (n − 1)/2. This was proved by Aimo Tietäväinen and
J. H. van Lint (1932-2004). Unfortunately, no simple proof of this fact is known.
Their proof uses the theory of equations and is quite complicated. See [22].


Theorem. The only values of n and e for which 1 < e < (n − 1)/2 and
$\sum_{k=0}^{e} \binom{n}{k}$ is a power of 2 are (n, e) = (23, 3) and (90, 2).


These values correspond to two special sums of binomial coefficients:

\[ 1 + \binom{23}{1} + \binom{23}{2} + \binom{23}{3} = 2^{11}, \]
\[ 1 + \binom{90}{1} + \binom{90}{2} = 2^{12}. \]

If there are codes with these values of n and e, they would have parameters (n,
|A|, d) = (23, 2^12, 7) and (90, 2^78, 5). We outline a proof in the exercises that
there is no code with parameters (90, 2^78, 5). A code with parameters (23, 2^12,
7), called the Golay code G_23, is constructed in Section 6.7.
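
The two binomial sums, and the absence of other solutions in a small range, can be checked by machine. The sketch below searches all pairs with 1 < e < (n − 1)/2 and n up to a chosen limit; the limit 101 and the function names are our assumptions, and a finite search of course proves nothing about larger n.

```python
from math import comb


def is_power_of_two(x):
    return x > 0 and x & (x - 1) == 0


def feasible_pairs(max_n):
    """Pairs (n, e) with 1 < e < (n - 1)/2 whose sphere volume is a power of 2."""
    pairs = []
    for n in range(1, max_n + 1):
        total = 1 + n                      # terms k = 0 and k = 1
        e = 2
        while 2 * e < n - 1:
            total += comb(n, e)
            if is_power_of_two(total):
                pairs.append((n, e))
            e += 1
    return pairs


print(feasible_pairs(101))  # prints [(23, 3), (90, 2)]
```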

We have taken the base field of our codes to be GF(2), but if we allow other
base fields, it turns out that there is only one more perfect code, the ternary code
G_11 with parameters (11, 3^6, 5), discovered by Marcel Golay. If we consider
any alphabet as the base set (not necessarily a field), then a perfect code is one of
the two Golay codes G_23 and G_11 or else has the parameters of a Hamming
code.


In the next section, we will describe the Hamming codes, a family of perfect
1-error-correcting codes.


EXERCISES 


5.5 Prove that the maximum number A(n, e) of codewords in an
e-error-correcting code in F^n satisfies the Gilbert lower bound
\[ A(n, e) \ge \frac{2^n}{\sum_{k=0}^{2e} \binom{n}{k}}. \]
5.6 Show that there exists no code A ⊆ GF(2)^10 with 19 words and d(A) = 5.
5.7 Use a computer to verify that the only ordered pairs (n, e) with 2 ≤ e < (n
− 1)/2 ≤ 50 and $\sum_{k=0}^{e} \binom{n}{k}$ a power of 2 are (23, 3) and (90, 2).
5.8 Prove that there is no (90, 2^78, 5) code.

Hint: Suppose that there is such a code. Without loss of generality, we
can assume that the code contains the zero vector (why?). The code,
being perfect, corresponds to a sphere packing of F^90 with 2^78 spheres of
radius 2. Let X be the set of weight 3 vectors in F^90 which have 1's in the
first two components. Show that X has 88 elements. How are the elements
of X partitioned by spheres around codewords of weight 5?


5.9 Show that (11, 3^6, 5) are feasible parameters for a perfect ternary code.


5.3 Hamming codes


As in the previous section, we define n = 2^r − 1 and m = 2^r − 1 − r, where r ≥ 2.
If r = 2, then n = 3 and m = 1, and the vertices 000 and 111 of the cube C3 of
Figure 5.1 constitute such a code. We now describe a code A ⊆ F^n with |A| = 2^m
that corrects one error, exhibiting a construction in the case r = 3, n = 7, m = 4.
The constructions for r > 3 are carried out similarly.

Let

(5.3)   \[ H = \begin{bmatrix} 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{bmatrix} \]

be the r × n = 3 × 7 matrix whose columns are the numbers 1,...,n written in
binary. The matrix H represents a linear transformation from F^7 to F^3:

H: F^7 → F^3

v ↦ Hv.

We define the code A to be the kernel of H; that is,

(5.4) A = {v ∈ F^7 : Hv = 0}.
(The 0 here is the 3 × 1 zero vector.) We call H a parity check matrix for A. Any
code A for which x ∈ A and y ∈ A imply x + y ∈ A is called a linear code. Clearly,
a code described as the kernel of a parity check matrix is a linear code.

By inspection, we find two of the vectors belonging to A:

[0 0 0 0 0 0 0]^T and [1 1 1 0 0 0 0]^T.

As the Hamming distance between these two vectors is 3, we see that d(A) ≤ 3.
We need to prove that d(A) ≥ 3 and |A| = 2^m = 16.


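Assume that v = [x y a z b c d]^T ∈ A. (The reason for the nonalphabetical listing
of the components of v will become clear in a moment.) Because v is in the
kernel of H, we have Hv = 0. Thus

(5.5)   \[ \begin{bmatrix} 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ a \\ z \\ b \\ c \\ d \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \]
which yields three equations
x + a + b + d = 0
y + a + c + d = 0
z + b + c + d = 0.


Because we are working in F, we have −x = x for all x, and the equations
become


x = a + b + d
y = a + c + d
(5.6) z = b + c + d.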


The variables a, b, c, d may independently take either value, 0 or 1, in F. For
this reason they are called free variables. There are 2^4 choices for the values of
the four free variables. The variables x, y, z are determined by these choices and
are therefore called determined variables. The values of the free variables are 
called information bits, while the values of determined variables are called check 
bits. 

We have shown that |A| = 16. It remains to prove d(A) ≥ 3, which we do by
showing that A corrects one error. Suppose that a codeword c ∈ A is sent and one
error occurs. Assuming that the error occurs in the ith component, we represent
the error by a vector consisting of a single 1 in the ith position and 0's in all other
positions:

e = [0 ... 0 1 0 ... 0]^T   (the 1 is in the ith position).

The received vector is c + e (which differs from c in just the ith component),
and it is the decoder’s job to determine the position in which the error has
occurred. This is done by exploiting properties of the matrix H. We multiply H
by c + e:
H(c + e) = Hc + He
= 0 + He (by definition of A)
= He.
Since e has only one nonzero row, the product He consists of the column of H
corresponding to this row position. In other words, He equals the ith column of
H. Because of the way H is constructed, this is the number i in binary. Thus,
when we compute H(c + e) the position of the error is revealed (in binary). We
have demonstrated that A corrects one error, so d(A) ≥ 2 · 1 + 1 = 3.
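
The decoding procedure just described can be sketched in a few lines. The code assumes the 3 × 7 parity check matrix whose ith column is i in binary, with the most significant bit in the top row (an assumption about the layout), and the names `syndrome` and `correct` are ours.

```python
# Parity check matrix: column i (1-based) is i written in binary,
# most significant bit in the top row (assumed layout).
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]


def syndrome(v):
    """Compute Hv over the field of two elements."""
    return [sum(h * x for h, x in zip(row, v)) % 2 for row in H]


def correct(received):
    """Fix at most one flipped bit by reading the syndrome as a position."""
    s = syndrome(received)
    position = 4 * s[0] + 2 * s[1] + s[2]   # 0 means no error detected
    fixed = list(received)
    if position:
        fixed[position - 1] ^= 1
    return fixed


sent = [1, 1, 1, 0, 0, 0, 0]        # a codeword: columns 1, 2, 3 sum to zero
received = sent.copy()
received[4] ^= 1                    # error in position 5
print(syndrome(received))           # prints [1, 0, 1], i.e., 5 in binary
print(correct(received) == sent)    # prints True
```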
We have already remarked that A is a perfect code. As a sphere packing, A
may be pictured as the centers of the spheres (in Euclidean space) referred to in
the introduction to this chapter. As we have already noted, every element of F^7
lies in exactly one sphere. This Hamming code (of dimension 7) has rate r(A) =
(log_2 16)/7 = 4/7. In general, the Hamming code of dimension n has rate

(5.7)   \[ \frac{\log_2 2^m}{n} = \frac{m}{n} = \frac{2^r - 1 - r}{2^r - 1}, \]
which tends to 1 as r tends to infinity. Thus, the Hamming codes are a family of
1-error-correcting codes with arbitrarily good rate.
The equations (5.6) allow us to determine the 16 codewords
of the (7, 16, 3) Hamming code. They are listed below, one codeword
[x y a z b c d]^T per column:


0 1 0 1 1 0 1 0
0 1 1 0 0 1 1 0
0 0 0 0 0 0 0 0
0 1 1 0 1 0 0 1
0 0 0 0 1 1 1 1
0 0 1 1 0 0 1 1
0 1 0 1 0 1 0 1

1 0 1 0 0 1 0 1
1 0 0 1 1 0 0 1
1 1 1 1 1 1 1 1
0 1 1 0 1 0 0 1
0 0 0 0 1 1 1 1
0 0 1 1 0 0 1 1
0 1 0 1 0 1 0 1


Since d(x, y) = w(x − y) for any x, y ∈ F^n, the distance of a linear code equals
the minimum weight of a nonzero codeword. Table 5.1 tallies the words of the
(7, 16, 3) Hamming code according to weight.


Table 5.1 Weight distribution of the Hamming code.
weight            0  3  4  7
number of words   1  7  7  1

The symmetry of the weight distribution is due to the fact that A is a linear
code containing the all-1 codeword. Thus, if v ∈ A, then also v^c ∈ A (where v^c is
the binary complement of v) and w(v^c) = 7 − w(v).

The seven Hamming codewords of weight 3 give rise to a finite geometry 
called the Fano Configuration (FC). The Fano Configuration has seven points, 1, 
2, 3, 4, 5, 6, 7, corresponding to the seven components of the code vectors of A. 
Each codeword of weight 3 contains, by definition, three 1's. The three points 


corresponding to the 1's are joined by a line, called an edge of FC. The edges are 
246, 167, 145, 257, 123, 347, and 356. 
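
The codewords, the weight distribution, and the seven edges can all be regenerated from the three check equations x = a + b + d, y = a + c + d, z = b + c + d. The sketch below does this with the component order [x y a z b c d]; the function and variable names are our own.

```python
from collections import Counter
from itertools import product


def hamming_codewords():
    """All 16 codewords (x, y, a, z, b, c, d) of the (7, 16, 3) Hamming code."""
    words = []
    for a, b, c, d in product((0, 1), repeat=4):   # free variables
        x = (a + b + d) % 2                        # check bits
        y = (a + c + d) % 2
        z = (b + c + d) % 2
        words.append((x, y, a, z, b, c, d))
    return words


words = hamming_codewords()
weights = Counter(sum(w) for w in words)
# Each weight-3 codeword gives an edge: the positions (1..7) of its 1's.
edges = sorted(
    "".join(str(i + 1) for i, bit in enumerate(w) if bit)
    for w in words if sum(w) == 3
)
print(sorted(weights.items()))  # prints [(0, 1), (3, 7), (4, 7), (7, 1)]
print(edges)  # prints ['123', '145', '167', '246', '257', '347', '356']
```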


EXERCISES 


5.10 Explain how sixteen 7-dimensional unit hyperspheres can be arranged
so that each hypersphere is tangent to exactly seven of the other
hyperspheres.

5.11 How reliable is the (7, 16, 3) Hamming code? Suppose that q = 0.05 
(the probability of an error in a single bit of the message). Compare the 
probability that exactly one error occurs (and hence the code corrects the 
error) with the probability of an error occurring if no code is used. 


5.12 Find a code A ⊆ GF(2)^8 with d(A) = 4 and r(A) = 1/2.


5.13 Let A be the (15, 2^11, 3) Hamming code. Suppose that v ∈ A is sent, at
most one error occurs, and w = 101000000000000 is received. Find v.

5.14 Show how to create a cyclic (7, 16, 3) Hamming code by making the
transformation [xyazbcd] → [yxzabdc]. In a cyclic code, cyclic shifts of
codewords are also codewords.

5.15 Use a computer to generate the elements of the (15, 2^11, 3) Hamming
code.
5.16 Seven players play a game as a team. They are in a room wearing hats 
of one of two colors, white or black. Each person can see the hats of the 
others but not his or her own hat. Each person’s hat color is chosen 
randomly, with probability of white and black hats equally likely and 
independent of the other players’ hats. Simultaneously, the players guess 
their hat colors or “pass” if they don’t want to guess. The team wins if at 
least one person guesses correctly and no person guesses incorrectly. The 
players are allowed to discuss possible strategies before they enter the room. 
What is their best strategy and what is their probability of winning? 
The general version of this problem (with n players) is due to Todd 
Ebert, who proposed it in his Ph.D. thesis at the University of California 
at Santa Barbara in 1998. 
5.17 Prove that the probability that the determinant of a random n x n matrix 
over a finite field is 0 tends to a constant as n tends to infinity. 


5.4 The Fano Configuration


The simplest combinatorial construction is known as the Fano Configuration,
named after the geometer Gino Fano (1871-1952). The Fano Configuration is
the prototypical example of many types of structures, such as projective planes,
block designs, and difference sets. Let’s look at the configuration and observe
some of its properties.
The Fano Configuration (FC) is shown in Figure 5.2. 


Figure 5.2 The Fano Configuration (FC).

The points of FC are labeled 1, 2, 3, 4, 5, 6, and 7. The lines are just unordered
triples of points (e.g., {1, 2, 3}); they have no Euclidean meaning. Hence, the
points may be located anywhere in space, and the lines may be drawn straight or
curved and may cross arbitrarily.
We observe that FC has the following properties: 
1. There are seven points. 
2. There are seven lines. 
3. Every line contains three points. 
4. Every point lies on three lines. 
5. Every two points lie on exactly one line. 
6. Every two lines intersect in exactly one point. 
The above properties occur in pairs called duals. If the words “point” and 
“line” are interchanged, each property is transformed into its dual property. 
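
The six properties are easy to confirm by machine. The sketch below assumes the seven lines 123, 145, 167, 246, 257, 347, 356 (the edges obtained from the weight-3 Hamming codewords) and checks properties 3 through 6; the variable names are ours.

```python
from itertools import combinations

POINTS = range(1, 8)
LINES = [{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6},
         {2, 5, 7}, {3, 4, 7}, {3, 5, 6}]

three_points_per_line = all(len(line) == 3 for line in LINES)
three_lines_per_point = all(
    sum(p in line for line in LINES) == 3 for p in POINTS)
two_points_one_line = all(
    sum({p, q} <= line for line in LINES) == 1
    for p, q in combinations(POINTS, 2))
two_lines_one_point = all(
    len(l & m) == 1 for l, m in combinations(LINES, 2))

print(three_points_per_line, three_lines_per_point,
      two_points_one_line, two_lines_one_point)  # prints True True True True
```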
FC has many fascinating properties. For instance, with its edges properly 
oriented, FC represents the multiplication table for the Cayley algebra. Here is a 
simple application. 


EXAMPLE 5.2 


Seven students are asked to evaluate a set of seven textbooks, making 
comparisons between pairs of textbooks. While it is possible for every 


student to read every book and write comparisons between each pair, this 
would be time consuming. Instead, each student receives three books to 
read, according to the FC diagram. Number the students 1, 2, 3, 4, 5, 6, 7. 
The books correspond to the lines and therefore can be numbered {1, 2, 3}, 
{2, 4, 6}, etc. Then, for example, student 2 receives books {1, 2, 3}, {2, 4, 
6}, and {2, 5, 7}. Each student writes a comparison between the three pairs 
of books that he or she reads. Thus, each student must write only three 
comparisons, instead of 21. 
We next determine the automorphism group of FC, denoted Aut FC. This 
automorphism group is the group of permutations of the vertices of FC that 
preserve collinearity. For example, if we rotate FC clockwise by one-third of a 
circle, collinearity is preserved. This permutation is written in cycle notation as 
(142)(356)(7). 

We begin by calculating the order of Aut FC. It is evident from Figure 5.2 that 
all seven vertices are equivalent in terms of collinearity. Therefore, vertex 1 may 
be sent by an automorphism to any of the seven vertices. Suppose that 1 is 
mapped to 1’. Vertex 2 may be mapped to any of the remaining six vertices. 
Suppose that 2 is mapped to 2’. In order to preserve collinearity, vertex 3 must 
be mapped to the unique point collinear with 1' and 2’. Call this point 3’. Vertex 
4 is not on line 123, so its image 4’ can be any of the remaining four vertices. 
Finally, the images of the other points are all determined by collinearity: 5 is 
collinear with 1 and 4; 6 is collinear with 2 and 4; and 7 is collinear with 3 and 4. 
Hence, there are 7 · 6 · 4 = 168 automorphisms.
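
The count can be confirmed by testing all 7! = 5040 permutations of the points. This is a brute-force sketch, assuming the seven lines 123, 145, 167, 246, 257, 347, 356; a permutation is stored as a tuple whose entry i − 1 is the image of point i, and `is_automorphism` is our own name.

```python
from itertools import permutations

LINES = frozenset(frozenset(line) for line in
                  [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6),
                   (2, 5, 7), (3, 4, 7), (3, 5, 6)])


def is_automorphism(perm):
    """True if the permutation maps every line onto a line."""
    return all(frozenset(perm[p - 1] for p in line) in LINES for line in LINES)


count = sum(is_automorphism(p) for p in permutations(range(1, 8)))
print(count)  # prints 168

# The rotation (142)(356)(7): 1->4, 4->2, 2->1, 3->5, 5->6, 6->3, 7->7.
rotation = (4, 1, 5, 2, 6, 3, 7)
print(is_automorphism(rotation))  # prints True
```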

Now we know that Aut FC is a group of order 168, but what group? Is it 
abelian? Cyclic? Simple? We will show that Aut FC is isomorphic to the group 
of invertible 3 x 3 matrices over F. 

Here is some background about groups of matrices. The general linear group
GL(n, q) is the set of invertible n × n matrices with coefficients in the Galois
field GF(q) of order q = p^k, where p is a prime, under matrix multiplication. The
order of GL(n, q) is readily determined. There are q^n − 1 choices for the first row
of an invertible n × n matrix (the all-0 row is excluded). Having chosen the first
row, the second row may be any of the q^n possible n-tuples except the q scalar
multiples of the first row. Hence, there are q^n − q choices for the second row.
Similarly, the third row may be any of the q^n n-tuples except linear
combinations of the first two rows. There are q^n − q^2 choices. Continuing in this
manner, we arrive at the total number of invertible matrices:
(5.8) |GL(n, q)| = (q^n − 1)(q^n − q)(q^n − q^2) ··· (q^n − q^{n−1}).
For example, GL(2, 2) has six elements and is in fact isomorphic to S_3. The
group Aut FC is isomorphic to GL(3, 2).
The special linear group SL(n, q) is a subgroup of GL(n, q) consisting of those
n × n invertible matrices with entries from GF(q) and determinant 1. We claim
that SL(n, q) is a normal subgroup of GL(n, q). For if M ∈ SL(n, q) and N ∈
GL(n, q), then
det(NMN^{-1}) = det N det M det N^{-1} = det N det N^{-1} = 1.
Now consider the homomorphism
f: GL(n, q) → GF(q) \ {0}
M ↦ det M.
The kernel of f is SL(n, q) and the homomorphism is clearly onto. Therefore, by
the first homomorphism theorem for groups,
(5.9) GL(n, q)/SL(n, q) ≅ GF(q) \ {0},
and hence,
(5.10) |SL(n, q)| = (q^n − 1)(q^n − q) ··· (q^n − q^{n−1}) / (q − 1).
Projective versions of the groups GL(n, q) and SL(n, q) are obtained as the
quotient groups of these groups by their centers. The projective general linear
group PGL(n, q) is GL(n, q)/Z(GL(n, q)) and the projective special linear group
PSL(n, q) is SL(n, q)/Z(SL(n, q)). In the exercises, the reader is asked to prove
the following formulas:
(5.11) |PGL(n, q)| = (q^n − 1)(q^n − q) ··· (q^n − q^{n−1}) / (q − 1)
and
(5.12) |PSL(n, q)| = (q^n − 1)(q^n − q) ··· (q^n − q^{n−1}) / (d(q − 1)),
where d = gcd(n, q − 1).
For all n ≥ 1, we have
GL(n, 2) ≅ SL(n, 2) ≅ PGL(n, 2) ≅ PSL(n, 2).
When n = 3 we obtain


|GL(3, 2)| = (2^3 − 1)(2^3 − 2)(2^3 − 2^2) = 7 · 6 · 4 = 168.
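
The order formulas are simple to evaluate. The following sketch, with function names of our own choosing, computes the orders of GL(n, q), SL(n, q), and PSL(n, q) from the product formula and the divisors q − 1 and gcd(n, q − 1).

```python
from math import gcd, prod


def order_gl(n, q):
    """|GL(n, q)| = (q^n - 1)(q^n - q) ... (q^n - q^(n-1))."""
    return prod(q**n - q**i for i in range(n))


def order_sl(n, q):
    return order_gl(n, q) // (q - 1)


def order_psl(n, q):
    return order_gl(n, q) // ((q - 1) * gcd(n, q - 1))


print(order_gl(2, 2), order_gl(3, 2))    # prints 6 168
print(order_psl(2, 3), order_psl(2, 7))  # prints 12 168
```

The value 12 matches the order of A_4, and PSL(2, 7) has the same order, 168, as GL(3, 2).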


We will now show that Aut FC is isomorphic to GL(3, 2). 


We label the vertices of FC with the seven nonzero vectors in F^3, as in Figure
5.3. This labeling is derived from the labeling of Figure 5.2 by assigning to each
vertex i the vector that represents the number i in binary. The vectors have been
chosen so that v_1, v_2, v_3 are collinear if and only if v_1 + v_2 + v_3 = 0 (in F^3). The
matrix group GL(3, 2) acts on the vectors of FC in the obvious way: v ↦ vM. It
is easy to check that this action preserves collinearity: v_1 + v_2 + v_3 = 0 if and
only if (v_1 + v_2 + v_3)M = 0 · M, which is true if and only if v_1M + v_2M + v_3M =
0. Therefore, Aut FC is isomorphic to the group of 3 × 3 invertible matrices over
F.


Figure 5.3 A vector representation of FC.

It is well known (see [23]) that PSL(n, q) is a simple group (a group with no
nontrivial normal subgroups) for all n ≥ 2 except n = 2 and q = 2 or q = 3.
Because there is only one nonabelian group of order 6, it follows that PSL(2, 2)
≅ S_3. It is also easy to show that PSL(2, 3) ≅ A_4. The simple groups are crucial
to the study of algebra. Some of them are difficult to describe, although we have
shown that PSL(3, 2) (≅ GL(3, 2)) has a nice geometric model.

Let us find generators for Aut FC. By direct calculation, we see that the rows
of M are the images of the binary representations of 1, 2, and 4. Let

\[ S = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 0 & 1 \end{bmatrix} \quad \text{and} \quad T = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}. \]

We observe from Figure 5.3 that T yields a reflection of FC around the 374 axis
while S yields the 7-cycle (1 5 6 7 2 4 3). We will show that all 168 matrices in
the automorphism group of FC are generated by combinations of S and T.


The “visible automorphisms” of FC are the symmetries of its equilateral 
triangular shape given by the elements of the symmetric group S3. These 
symmetries are combinations of S and T, for T is a reflection of the triangle (i.e.,
a transposition of two of its vertices) and (S^2TS^2)^2T is a rotation of the triangle
by one-third of a circle; all symmetries of the triangle are combinations of a
reflection and a rotation. Furthermore, STS^2T is a transvection (the identity
matrix with an extra 1 in an off-diagonal position). From the transvection, we 
can produce all transvections via conjugation by permutations. Using elementary 
row operations, all invertible matrices can be formed from permutation matrices 
and transvections. Therefore, all invertible matrices are combinations of S and T. 
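
These claims about S and T can be checked by direct computation over the field of two elements. The sketch below takes S = [[0,1,1],[1,0,0],[1,0,1]] and T = [[1,0,0],[0,0,1],[0,1,0]], lets matrices act on row vectors (v ↦ vM) with point i encoded as i in binary, and uses helper names (`mul`, `word`, `act`, `generate`) of our own.

```python
from functools import reduce

I3 = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
S = ((0, 1, 1), (1, 0, 0), (1, 0, 1))
T = ((1, 0, 0), (0, 0, 1), (0, 1, 0))


def mul(A, B):
    """Matrix product AB over the field of two elements."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(3)) % 2
                       for j in range(3)) for i in range(3))


def word(*factors):
    """Product of the factors, left to right (the leftmost acts first on row vectors)."""
    return reduce(mul, factors, I3)


def act(v, M):
    """Image of the point v (1..7, read as a 3-bit row vector) under v -> vM."""
    bits = ((v >> 2) & 1, (v >> 1) & 1, v & 1)
    r = [sum(bits[i] * M[i][j] for i in range(3)) % 2 for j in range(3)]
    return 4 * r[0] + 2 * r[1] + r[2]


def generate(gens):
    """Closure of the generators under multiplication."""
    seen, frontier = {I3}, [I3]
    while frontier:
        A = frontier.pop()
        for g in gens:
            B = mul(A, g)
            if B not in seen:
                seen.add(B)
                frontier.append(B)
    return seen


cycle, v = [1], act(1, S)
while v != 1:
    cycle.append(v)
    v = act(v, S)
print(cycle)                         # prints [1, 5, 6, 7, 2, 4, 3]

transvection = word(S, T, S, S, T)   # S T S^2 T
block = word(S, S, T, S, S)          # S^2 T S^2
rotation = word(block, block, T)     # (S^2 T S^2)^2 T
print(transvection)                  # prints ((1, 1, 0), (0, 1, 0), (0, 0, 1))
print([act(p, rotation) for p in range(1, 8)])  # prints [2, 4, 6, 1, 3, 5, 7]
print(len(generate([S, T])))         # prints 168
```

The rotation acts as (1 2 4)(3 6 5)(7), a one-third turn of the triangle, and the closure computation confirms that S and T generate all 168 matrices.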

A presentation of the group is

G = ⟨s, t : s^7 = t^2 = (st)^3 = (s^4 t)^4 = 1⟩.

The matrix ST has order 3 (but isn’t a rotation). Let U = (ST)^{-1}. Then the
group is a homomorphic image of the “triangle group” generated by S, T, and U:

⟨s, t, u : s^7 = t^2 = u^3 = stu = 1⟩.

This is the (infinite) group of symmetries of a tiling of the hyperbolic plane
with triangles whose angles are π/2, π/3, and π/7. This tiling is shown in Figure
5.4.


Figure 5.4 A triangle tiling of the hyperbolic plane.


EXERCISES


5.18 Find a design, other than FC, with seven points and seven lines, each 


line containing three points, and each point on three lines. 
5.19 (Sylvester’s problem) Can you draw FC in the plane with straight lines? 
This exercise will show that it is not possible. 
Suppose that X is a finite set of points in the plane with the property that 
every line determined by two points of X contains a third point of X. 
(a) Prove that X is collinear. 
(b) Show that the assertion in part (a) is false if X is infinite. 
5.20 Prove that 
(2” — 1)(2" — 2)(2" — 22)(2" — 28)...(2" — 2-1) 
n! 
is an integer for alln < 1. 


5.21 Prove formulas (5.11) and (5.12). 
5.22 Show that (s2 TS2)2T is a rotation of FC by one-third of a circle and 


STST is a transvection. 
5.23 Prove that Gauss’s g-binomial coefficient 
n} _ (g*-1)(@"- g(a" - 9) ---(@" - 9°") 
hk}, (a — 1)(a* — a)(a* — 9”) ---(@* — a") 
is equal to the number of k-dimensional subspaces of a vector space of 
dimension n over a field of size q. 
5.24 Show that the elements of GL(3, 2) have orders 1, 2, 3, 4, or 7. How 
many elements of each order are there? 
5.25 What group is Aut(Z_2 ⊕ Z_2 ⊕ Z_2)?
5.26 Prove that PSL(2, 7) ≅ GL(3, 2).
Hint: Use a group presentation.
5.27 Prove that the automorphism group of the Hamming code of length n =
2^r − 1 is isomorphic to GL(r, 2).
5.28 Prove that the group 
\[ G = \langle s, t \mid s^7 = t^2 = (st)^3 = (s^4 t)^4 = 1 \rangle \]
has 168 elements. 


Notes 


Richard Hamming (1915-1998) developed the Hamming codes in 1947 at Bell 
Telephone Laboratories in order to solve the problem of glitches in running 
computer programs. Marcel Golay (1902-1989) did much of the same work 
independently at the Signal Corps Engineering Laboratories. For a history of 
their achievements the reader is referred to [27]. Today, error-correcting codes 
are used in the design of compact discs (CDs), satellite communications, ISBN 
numbers, and bar code scanners. Error-correcting codes are an important part of 
the science of information theory introduced by Claude Shannon (1916–2001) in
1948 (also at Bell Labs). The main ideas of information theory are that an
information source contains a certain amount of uncertainty (entropy) and that 
entropy determines the accuracy and amount of information that can be sent over 
a communication channel. In a memoryless source of English characters, for 
example, the entropy is thought to be about 4.03 bits (each revealed character 
conveys about 4.03 bits of information). As a baseline comparison, the entropy 
of a memoryless source of 27 equally likely symbols (26 letters and a space) is 
approximately 4.76 bits. (By the way, in a popular word-making board game the 
distribution of letter tiles gives an entropy value of about 4.32 bits.) However, if 
knowledge of the English language is used to predict the likelihood of future 
characters (the source has a memory), then the entropy is believed to be between 
0.6 and 1.3 bits. Knowledge of the information structure of a language is useful 
in the design of machines that translate the spoken word into written text 
(continuous speech recognition). 

A linear code has a generating matrix. The (7, 16, 3) Hamming code, as
constructed in this chapter, is generated by the 7 × 4 matrix
\[ G = \begin{pmatrix}
1 & 1 & 0 & 1 \\
1 & 0 & 1 & 1 \\
1 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}. \]
The code is the range of the corresponding linear transformation, the set of
vectors Gv ∈ F^7, where v ∈ F^4.
For a detailed introduction to error-correcting codes, a good source is [22]. 


1 In Fano's geometry, what we call the Fano Configuration was specifically
excluded.


CHAPTER 6 


COMBINATORIAL DESIGNS 


The Fano Configuration FC is the simplest nontrivial example of many types of
combinatorial configurations, including t-designs, Steiner systems, block
designs, and projective planes. We next explore these designs and investigate
their interconnections, paving the way for the construction of the Golay code
G_23, the only perfect binary code capable of correcting more than one error, and
Leech's remarkable 24-dimensional lattice.


6.1 t-designs 


A t-(v, k, λ) design (or t-design) consists of a v-set S and a collection C of
k-subsets of S, with the property that every t-subset of S is contained in exactly λ
members of C. The elements of S are called points and the elements of C are
called blocks.

A t-design is nontrivial if 0 < t < k < v and not every k-subset of S is a block.


EXAMPLE 6.1 


The lines 246, 167, 145, 257, 123, 347, and 356 of FC are the blocks of a 2- 
(7, 3, 1) design with S = {1,...,7}. 


EXAMPLE 6.2 


The complement FC’ of FC has the same set of vertices as FC. Its lines are 
the complements of the lines of FC: 1357, 2345, 2367, 1346, 4567, 1256, 
1247. It is easy to verify that the lines of FC’ constitute the blocks of a 2-(7, 
4, 2) design. 


EXAMPLE 6.3 


The following sets are the blocks of a 3-(8, 4, 1) design: 
1357 2345 2367 1346 4567 1256 1247 
2468 1678 1458 2578 1238 3478 3568 


These blocks are of two types: (1) the lines of FC’ and (2) the lines of FC 
joined to a new element 8. The derived design obtained by removing any 
point and all the sets not incident with it is equivalent to FC. Conversely, we 
call the 3-(8, 4, 1) design an extension of FC. 
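
The claims in Examples 6.1 to 6.3 can be checked by brute force. The following Python sketch is only an illustration; the helper names design_lambda and parse are ours and do not come from the text. It counts, for every t-subset, the number of blocks containing it and reports the common value λ if there is one.

```python
from itertools import combinations

def design_lambda(points, blocks, t):
    """Return lambda if every t-subset of `points` lies in the same
    number of blocks; return None if the counts differ."""
    counts = {sum(1 for b in blocks if set(T) <= b)
              for T in combinations(points, t)}
    return counts.pop() if len(counts) == 1 else None

def parse(text):
    """Turn a string such as '246 167' into a list of sets of digits."""
    return [frozenset(int(ch) for ch in word) for word in text.split()]

fano = parse("246 167 145 257 123 347 356")        # lines of FC
points7 = range(1, 8)
fano_c = [frozenset(points7) - b for b in fano]    # lines of FC'
extension = fano_c + [b | {8} for b in fano]       # blocks of Example 6.3

print(design_lambda(points7, fano, 2))             # 1: a 2-(7, 3, 1) design
print(design_lambda(points7, fano_c, 2))           # 2: a 2-(7, 4, 2) design
print(design_lambda(range(1, 9), extension, 3))    # 1: a 3-(8, 4, 1) design
```

The same function returns None when the block counts are not constant, for instance when t = 4 is tried on the 3-(8, 4, 1) design.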


EXAMPLE 6.4 


A graph with p vertices and q edges is a 0-(p, 2, q) design. An r-regular
graph is a 1-(p, 2, r) design. An r-regular k-uniform hypergraph is a
1-(p, k, r) design.

We say that two t-designs are equivalent if they can be made the same by
relabeling their underlying sets. Each of the nontrivial designs above is unique 
up to this equivalence. One reason for the uniqueness is that many parameters of 
a design are determined by the following theorem. 


Parameter theorem. Given a t-(v, k, λ) design with point set S and block
set C and 0 ≤ i ≤ t, there exists a constant λ_i such that every i-set of S lies
in exactly λ_i elements of C. Therefore, the design is an i-(v, k, λ_i) design.
Furthermore, λ_i satisfies
\[ \lambda_i \binom{k-i}{t-i} = \lambda \binom{v-i}{t-i}. \]


Proof. Let X be a fixed subset of S with |X| = i, and consider the ordered pairs (T,
K) with |T| = t, K ∈ C, and X ⊆ T ⊆ K. As in the proof of Burnside's lemma, we
count the ordered pairs in two ways (from the perspective of each coordinate) to
obtain
\[ \lambda_i \binom{k-i}{t-i} = \lambda \binom{v-i}{t-i}, \]
a relation independent of X.


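We solve the parameter equation for λ_i:
\[ \lambda_i = \lambda \, \frac{\binom{v-i}{t-i}}{\binom{k-i}{t-i}}. \tag{6.1} \]
Putting in the value i = 0, we obtain the number b of blocks in C:
\[ b = \lambda_0 = \lambda \, \frac{\binom{v}{t}}{\binom{k}{t}}. \tag{6.2} \]
Putting in the value i = 1, we obtain the number r of times each element of S
occurs in a block:
\[ r = \lambda_1 = \lambda \, \frac{\binom{v-1}{t-1}}{\binom{k-1}{t-1}}. \tag{6.3} \]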


The reader should verify the formulas for b and r in the above examples. 
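
As a small aid to that verification, here is a Python sketch that evaluates the parameter formula for the designs of Examples 6.1 to 6.3. The function name lam and the dictionary of examples are our own labels, chosen for illustration. Exact fractions are used so that a non-integer value, which would rule out the existence of a design, is not hidden by rounding.

```python
from fractions import Fraction
from math import comb

def lam(i, t, v, k, lam_t):
    """lambda_i of a t-(v, k, lam_t) design, computed from the parameter formula."""
    return Fraction(lam_t * comb(v - i, t - i), comb(k - i, t - i))

examples = {
    "FC, a 2-(7, 3, 1) design": (2, 7, 3, 1),
    "FC', a 2-(7, 4, 2) design": (2, 7, 4, 2),
    "the 3-(8, 4, 1) design": (3, 8, 4, 1),
}
for name, (t, v, k, l) in examples.items():
    b = lam(0, t, v, k, l)   # number of blocks
    r = lam(1, t, v, k, l)   # number of blocks through each point
    print(f"{name}: b = {b}, r = {r}")
```

The three lines printed give b = 7, r = 3 for FC; b = 7, r = 4 for FC'; and b = 14, r = 7 for the 3-(8, 4, 1) design.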


An S(t, k, v) Steiner system is a t-design with λ = 1. Later, we construct the
Golay code G_23 via a related code that contains an S(5, 8, 24) Steiner system.


No Steiner system is known with t > 5. The only known Steiner systems with t =
5 are S(5, 6, 12), S(5, 6, 24), S(5, 6, 36), S(5, 6, 48), S(5, 6, 72), S(5, 6, 84), S(5, 6,
108), S(5, 6, 132), S(5, 6, 168), S(5, 6, 244), S(5, 7, 28), and S(5, 8, 24) designs,
and the only known Steiner systems with t = 4 are derived from these designs.

Open problem. Determine whether there is a Steiner system with t > 5.


A Steiner triple system is a Steiner system with k = 3. 


EXAMPLE 6.5 
FC is an S(2, 3, 7) Steiner triple system.


EXAMPLE 6.6 
An S(2, 3, 9) Steiner triple system is given by the set of blocks 
{123, 456, 789, 147, 168, 159, 258, 369, 249, 357, 267, 348}. 


This Steiner system is equivalent to the set of nonideal points and lines of 
the projective plane of order 3. 


EXAMPLE 6.7 


Show that the weight 3 codewords of the Hamming code of length 15 form a 
Steiner triple system. What are its parameters? 


EXAMPLE 6.8 


The 81 cards of the SET® game are the points of an S(2, 3, 81) Steiner triple 
system. The blocks are the 1080 possible sets. Each block contains three 
points and each point is contained in exactly 40 blocks. 

We can define further parameters for a t-(v, k, λ) design.


Double-parameter theorem. Given a t-(v, k, λ) design with points S and
blocks C, and i and j nonnegative integers satisfying i + j ≤ t, there exists
a constant λ_{ij} such that the number of blocks which contain all the
elements of any fixed i-set of S and omit all the elements of any fixed
j-set of S (the i-set having no elements in common with the j-set) is exactly
λ_{ij}. Furthermore, λ_{ij} satisfies
\[ \lambda_{ij} \binom{v-t}{k-t} = \lambda \binom{v-i-j}{k-i}. \]


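Proof. From the parameter theorem and the inclusion–exclusion principle, we
obtain
\[ \lambda_{ij} = \sum_{s=0}^{j} (-1)^s \binom{j}{s} \lambda_{i+s}
 = \lambda \sum_{s=0}^{j} (-1)^s \binom{j}{s} \frac{\binom{v-(i+s)}{t-(i+s)}}{\binom{k-(i+s)}{t-(i+s)}}. \]
It follows that λ_{ij} satisfies the relation stated in the theorem.


EXAMPLE 6.9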
Recall that FC is a 2-(7, 3, 1) design. Applying the double-parameter
theorem, we determine the constants λ_00 = 7, λ_10 = 3, λ_01 = 4, λ_20 = 1, λ_11
= 2, λ_02 = 2. For example, the relation λ_11 = 2 says that, given any distinct points x
and y of FC, there are precisely two lines which contain x and omit y.
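
The six constants of Example 6.9 can be confirmed in two ways, by the formula of the double-parameter theorem and by direct counting. The Python sketch below does both for FC; the function names lam_formula and lam_count are hypothetical labels of ours.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def lam_formula(i, j, t, v, k, lam):
    """lambda_ij from the relation in the double-parameter theorem."""
    return Fraction(lam * comb(v - i - j, k - i), comb(v - t, k - t))

def lam_count(points, blocks, i, j):
    """Count blocks containing an i-set and avoiding a disjoint j-set,
    checking that the count is the same for every choice of the two sets."""
    points = list(points)
    counts = set()
    for I in combinations(points, i):
        rest = [p for p in points if p not in I]
        for J in combinations(rest, j):
            counts.add(sum(1 for b in blocks
                           if set(I) <= b and not set(J) & b))
    assert len(counts) == 1, "count depends on the chosen sets"
    return counts.pop()

fano = [frozenset(int(ch) for ch in w)
        for w in "246 167 145 257 123 347 356".split()]

for i, j in [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]:
    counted = lam_count(range(1, 8), fano, i, j)
    assert counted == lam_formula(i, j, 2, 7, 3, 1)
    print(f"lambda_{i}{j} = {counted}")
```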


EXERCISES 


6.1 Draw the S(2, 3, 9) Steiner system. 
6.2 Construct a 2—(21, 5, 1) design. 
6.3 Prove that in an S(4, 5, 11) Steiner system no two blocks are disjoint.
Prove that in an S(5, 6, 12) Steiner system the complement of any block is a
block.
Hint: Let {a, b, c, d, e} be a block in the S(4, 5, 11) Steiner system. Let
A be the set of blocks containing a, B the set of blocks containing b, etc.
Use the inclusion–exclusion principle and knowledge of the values of λ_i
to find |A ∪ B ∪ C ∪ D ∪ E|. Solve the problem about S(5, 6, 12)
similarly.
6.4 Prove that the double parameters λ_{ij} satisfy the relations λ_{i0} = λ_i and
λ_{(i−1)j} = λ_{(i−1)(j−1)} − λ_{i(j−1)}. Show that from these relations the values of λ_{ij}
for all i + j ≤ t can be calculated.
6.5 Show that a 2-(2n + 1, n, λ) design can be extended (by adding one
element) to a 3-(2n + 2, n + 1, λ) design.


6.2 Block designs 


A balanced incomplete block design (BIBD) is a nontrivial 2-(v, k, λ) design.
The parameter theorem shows that there is a number r such that each element of
the set S occurs in exactly r blocks. Letting v = |S|, we rephrase the definition of
balanced incomplete block design. A (v, b, r, k, λ) BIBD is a family C of b
subsets (blocks or lines or edges) of a set S of v elements (points or vertices)
such that:

1. Each point of S lies in exactly r blocks.

2. Each block has k points.

3. Each pair of points of S occur together in λ blocks (the “balance” condition).

4. Not every k-set of S is a block (the “incompleteness” condition).

The nontriviality condition becomes 2 < k < v.


EXAMPLE 6.10 


FC is a (7, 7, 3, 3, 1) BIBD. FC’ is a (7, 7, 4, 4, 2) BIBD. The S(2, 3, 9)
Steiner system is a (9, 12, 4, 3, 1) BIBD.


Parameter theorem for block designs. In a (v, b, r, k, λ) BIBD, bk = vr
and r(k − 1) = λ(v − 1).


Proof. These relations follow from the parameter theorem for t-designs upon
letting t = 2. The first result is obtained by dividing the relation (6.2) by the
relation (6.3). The second result is immediate from (6.3).


The relations of the theorem can be proved without using the parameter
theorem. To prove the first relation, note that the total number of incidences
between vertices and blocks is bk (from the point of view of the blocks) and also
vr (from the point of view of the vertices). Therefore bk = vr. To prove the
second relation, note that the number of times a particular element occurs in
pairs with other elements is r(k − 1). But it is also λ(v − 1), the number of
elements with which the particular element may be paired multiplied by the
number of times each pair occurs. Therefore r(k − 1) = λ(v − 1).

The incidence relation between blocks and vertices can be displayed in an
incidence matrix A = [a_{ij}]_{b × v}. We choose orderings of the points and the
blocks and let a_{ij} = 1 if the jth point is an element of the ith block and a_{ij} = 0
otherwise. For instance, the points 1,...,7 and the blocks 246, 167, 145, 257, 123,
347, and 356 of FC are represented by the incidence matrix
\[ A = \begin{pmatrix}
0 & 1 & 0 & 1 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 & 0 & 1 & 1 \\
1 & 0 & 0 & 1 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 & 1 & 1 & 0
\end{pmatrix}. \]

Theorem. For any (v, b, r, k, λ) BIBD:
\[ \det(A^tA) = rk(r - \lambda)^{v-1}. \tag{6.4} \]


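Proof. From the block size and balance condition of a BIBD, it follows that
\[ A^tA = \begin{pmatrix}
r & \lambda & \cdots & \lambda \\
\lambda & r & \cdots & \lambda \\
\vdots & \vdots & \ddots & \vdots \\
\lambda & \lambda & \cdots & r
\end{pmatrix}. \tag{6.5} \]

Subtracting the first row from the others, then replacing the first column with
the sum of all the columns, we obtain
\[ \det(A^tA) = \begin{vmatrix}
r + \lambda(v-1) & \lambda & \lambda & \cdots & \lambda \\
0 & r - \lambda & 0 & \cdots & 0 \\
0 & 0 & r - \lambda & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & r - \lambda
\end{vmatrix}
= [r + (v-1)\lambda](r - \lambda)^{v-1} = rk(r - \lambda)^{v-1}, \]
where the last step uses the relation λ(v − 1) = r(k − 1).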


Fisher's inequality. In any BIBD, b ≥ v.

Proof. Let A be an incidence matrix of the (v, b, r, k, λ) BIBD. Thus, A is a
matrix of dimensions b × v. Let I be the v × v identity matrix and J the v × v
all-1's matrix. From the previous theorem it follows that A^tA = λJ + (r − λ)I and
det(A^tA) = rk(r − λ)^{v−1}. The relation r(k − 1) = λ(v − 1), together with k < v,
implies that r > λ, which means that det(A^tA) ≠ 0. Hence, v = rank(A^tA) ≤
rank(A) ≤ min{v, b} ≤ b.
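
The determinant formula (6.4) and Fisher's inequality can be checked numerically on the designs met so far. The following Python sketch is illustrative only: the helpers det, incidence_matrix, gram, and parse are our own, and the determinant is computed by exact elimination over the rationals rather than by any method used in the text.

```python
from fractions import Fraction

def det(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            d = -d                      # a row swap changes the sign
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return d

def incidence_matrix(v, blocks):
    """b x v matrix: entry (i, j) is 1 when point j+1 lies in block i."""
    return [[1 if p in blk else 0 for p in range(1, v + 1)] for blk in blocks]

def gram(A):
    """Return the v x v matrix A^t A."""
    cols = list(zip(*A))
    return [[sum(x * y for x, y in zip(ci, cj)) for cj in cols] for ci in cols]

def parse(text):
    return [{int(ch) for ch in w} for w in text.split()]

designs = {  # name: (v, b, r, k, lambda, blocks)
    "FC": (7, 7, 3, 3, 1, parse("246 167 145 257 123 347 356")),
    "S(2, 3, 9)": (9, 12, 4, 3, 1,
                   parse("123 456 789 147 168 159 258 369 249 357 267 348")),
}
for name, (v, b, r, k, lam, blocks) in designs.items():
    A = incidence_matrix(v, blocks)
    assert det(gram(A)) == r * k * (r - lam) ** (v - 1)
    print(name, "det(A^t A) =", det(gram(A)), " b >= v:", b >= v)
```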

The extreme case v = b gives rise to an interesting subclass of block designs. A
(v, k, λ) square block design (SBD) is a (v, b, r, k, λ) BIBD in which v = b (and
hence also k = r). Square block designs are usually called “symmetric block
designs,” but this is probably not the best term for them as their incidence
matrices are not symmetric. We believe that the term “square block design” is a
better choice. Note that in a (v, k, λ) SBD we have k(k − 1) = λ(v − 1).


EXAMPLE 6.11 
FC is a (7, 3, 1) SBD and FC’ is a (7, 4, 2) SBD.


Theorem. The incidence matrix A of a (v, k, λ) SBD is normal: A^tA =
AA^t. Thus, any two distinct blocks intersect in exactly λ elements.


Proof. We have A^tA = (k − λ)I + λJ. Because det(A^tA) ≠ 0, it follows that det A ≠
0, so A^{-1} exists. Therefore
\[ AA^t = AA^tAA^{-1} = A(k - \lambda)IA^{-1} + \lambda AJA^{-1} = (k - \lambda)I + \lambda AJA^{-1}. \]
Every row of A contains k ones, so AJ = kJ. Also, JA = kJ implies that
JA^{-1} = k^{-1}J. Hence AJA^{-1} = J and AA^t = (k − λ)I + λJ. An
interpretation of this matrix product yields the desired intersection property.


If A is the incidence matrix of an SBD, then (det A)^2 = k^2(k − λ)^{v−1}, and so
det A = ±k(k − λ)^{(v−1)/2}. It follows that if v is even, then k − λ is a perfect square.
This is the first part of the Bruck–Chowla–Ryser theorem, stated below. For a
proof of the second part, see [12].


Bruck–Chowla–Ryser theorem (1949). Suppose that a (v, k, λ) SBD
exists. Then the following two statements hold:


1. If v is even, then k − λ is a perfect square.
2. If v is odd, then the equation
\[ x^2 = (k - \lambda)y^2 + (-1)^{(v-1)/2} \lambda z^2 \]
has a solution in integers x, y, and z, not all zero.


EXAMPLE 6.12 


There is no (22, 7, 2) SBD, since k − λ = 5, which is not a perfect square.

Let D be a BIBD. The complement D’ of D is obtained by switching 0 and 1 in
an incidence matrix of D. It is easy to check that the complement of a (v, b, r, k,
λ) BIBD is a (v, b, b − r, v − k, b − 2r + λ) BIBD. Specifically, the complement of
a (v, k, λ) SBD is a (v, v − k, v − 2k + λ) SBD.


EXAMPLE 6.13 
The S(2, 3, 9) Steiner system is a (9, 12, 4, 3, 1) BIBD whose complement is 
a (9, 12, 8, 6, 5) BIBD. 

If p is a prime with p ≡ 3 (mod 4), then we can construct a (p, (p + 1)/2, (p + 1)/4)
square design via the set R_p of quadratic residues modulo p. If p is any prime
greater than 2, then the map f : Z_p^* → R_p with f(x) = x^2 is an epimorphism with
kernel {−1, 1}, from which it follows by the first homomorphism theorem for
groups that |R_p| = (p − 1)/2. Let N_p be the set of quadratic nonresidues modulo p,
so that |N_p| = (p − 1)/2. The Legendre symbol (x/p) is defined as
\[ \left(\frac{x}{p}\right) = \begin{cases}
0 & \text{if } x \equiv 0 \pmod{p} \\
1 & \text{if } x \in R_p \\
-1 & \text{if } x \in N_p.
\end{cases} \tag{6.6} \]
Because |R_p| = |N_p|, we have Σ(x/p) = 0 for any sum over a complete residue
system modulo p.

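Assuming that p ≡ 3 (mod 4), let A be the p × p circulant binary matrix whose
first row is the characteristic vector of R_p ∪ {0} and whose other rows are
successive one unit shifts to the right of the first row. We claim that A is the
incidence matrix of a (p, (p + 1)/2, (p + 1)/4) SBD. Evidently, v = b = p and
k = r = (p + 1)/2. We only need to check that the dot product of any two distinct
rows is (p + 1)/4. The dot product of two rows which differ by a shift of j units to
the right is
\[ \frac{1}{2}\left[1 + \left(\frac{j}{p}\right)\right] + \frac{1}{2}\left[1 + \left(\frac{-j}{p}\right)\right]
 + \frac{1}{4} \sum_{x \in S} \left[1 + \left(\frac{x}{p}\right)\right]\left[1 + \left(\frac{x+j}{p}\right)\right], \]
where S is a complete residue system modulo p except for the values 0 and −j.
Since p is congruent to 3 modulo 4, it follows that −1 is a quadratic nonresidue
modulo p. Therefore, exactly one of j and −j is a quadratic residue and the other
is a quadratic nonresidue. Letting x^{-1} be the multiplicative inverse of x, we have
\[ \left(\frac{x}{p}\right)\left(\frac{x+j}{p}\right) = \left(\frac{x^2}{p}\right)\left(\frac{1 + jx^{-1}}{p}\right) = \left(\frac{1 + jx^{-1}}{p}\right). \]
Because x^{-1} takes all values except 0 and −j^{-1}, it follows that 1 + jx^{-1} takes all
values except 1 and 0. Therefore
\[ \sum_{x \in S} \left(\frac{x}{p}\right)\left(\frac{x+j}{p}\right) = -\left(\frac{1}{p}\right) = -1. \]
Expanding the product in the sum over S, and noting that the sums of (x/p) and of
((x + j)/p) over S are −(−j/p) and −(j/p), which cancel, we find that the dot
product equals 1 + [(p − 2) − 1]/4 = (p + 1)/4.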


For example, with p = 7, the construction furnishes a (7, 4, 2) design 
equivalent to FC’. 
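
The quadratic residue construction is easy to test by machine for small primes. The Python sketch below is an illustration under our own naming (qr_blocks and is_sbd are not from the text). It builds the p blocks obtained by shifting R_p ∪ {0} and checks the block size, the pairwise block intersections, and the number of blocks through each pair of points.

```python
from itertools import combinations

def qr_blocks(p):
    """Blocks given by the rows of the circulant matrix A: the set of
    squares modulo p (residues together with 0) and its p cyclic shifts."""
    first = {x * x % p for x in range(p)}
    return [frozenset((x + s) % p for x in first) for s in range(p)]

def is_sbd(v, blocks, k, lam):
    """Check the defining properties of a (v, k, lam) square block design
    on the point set {0, ..., v-1}."""
    return (len(blocks) == v
            and all(len(b) == k for b in blocks)
            and all(len(b & c) == lam for b, c in combinations(blocks, 2))
            and all(sum(1 for b in blocks if x in b and y in b) == lam
                    for x, y in combinations(range(v), 2)))

for p in (7, 11, 19, 23):            # primes congruent to 3 modulo 4
    print(p, is_sbd(p, qr_blocks(p), (p + 1) // 2, (p + 1) // 4))
```

Each line should report True. For a prime such as 13, which is congruent to 1 modulo 4, the same check fails, in keeping with the role that the hypothesis plays in the argument.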

This construction produces square designs with large values of λ. Such designs
are equivalent to Hadamard designs. At the opposite extreme, the next section
deals with λ = 1 designs, which are called projective planes.


EXERCISES 
6.6 Prove that the complement of a (v, b, r, k, λ) BIBD is a (v, b, b − r, v − k,
b − 2r + λ) BIBD.
6.7 Find a circulant incidence matrix for FC. 
6.8 Construct an (11, 6, 3) SBD. 
6.9 Construct a (37, 9, 2) SBD 
Hint: Let one set be the nonzero fourth powers modulo 37. 
6.10 Construct a (16, 6, 2) SBD. 
Hint: Let one set be the cubes in a field of 16 elements. 
6.11 Prove the nonuniform Fisher inequality (R. C. Bose, 1949): Let C_1,
...,C_m ⊆ {1,...,n} be distinct sets and suppose that |C_i ∩ C_j| = λ for all i ≠ j,
where 1 ≤ λ < n. Then m ≤ n.
6.12 (E. R. Berlekamp, 1969) Prove that if C_1,...,C_n are subsets of a t-set
such that |C_i| is odd for all i and |C_i ∩ C_j| is even for all i ≠ j, then n ≤ t.
Hint: Show that the characteristic vectors of the C_i are linearly
independent over the field {0, 1}.
6.13 Use the previous two exercises to prove the following constructive
lower bound for diagonal Ramsey numbers (Z. Nagy, 1972):
\[ R(t + 1, t + 1) > \binom{t}{3}. \]
Hint: Let the vertices of K_v, where v = \binom{t}{3}, be the 3-subsets of {1,...,t}.
Color the edge {X, Y} red if |X ∩ Y| = 1 and green if |X ∩ Y| = 0 or 2.

6.3 Projective planes 


In a (v, k, λ) SBD, suppose that λ = 1, and set n = k − 1. Then from the relation
k(k − 1) = λ(v − 1), we have v = n^2 + n + 1. Such a design is called a finite
projective plane π_n of order n. Thus, a π_n is an (n^2 + n + 1, n + 1, 1) SBD. The
elements of π_n are called points and the blocks are called lines. As a square
design, a finite projective plane of order n has the following properties:

1. There are n^2 + n + 1 points.

2. There are n^2 + n + 1 lines.

3. Every line is incident with n + 1 points.

4. Every point is incident with n + 1 lines.

5. Every two points determine a unique line.

6. Each pair of lines determines a unique point.


As we have seen with FC, these properties occur in pairs called dual
properties. If the words “point” and “line” are interchanged, each property is
transformed into its dual property.


EXAMPLE 6.14 A projective plane of order 2 


FC is a projective plane of order 2.

Another description of projective planes comes from linear algebra. For FC, let
F = {0, 1} be the two-element field, and let V = F^3 = {(x, y, z) : x, y, z ∈ F}, the
three-dimensional vector space over F. The points of the projective plane are the
seven 1-dimensional subspaces of V, and the lines are the seven 2-dimensional
subspaces of V. The reader can check that the above six properties hold.


Theorem. A projective plane π_n exists for every prime power n.


Proof. Let F = GF(n), the Galois field of order n. Let the set of points of the
plane be
\[ S = \{(i, j) : i, j \in F\} \cup \{i : i \in F\} \cup \{\infty\}. \]
The points i ∈ F are called ideal points. The point ∞ is called the point at infinity.
The lines are
\[ l_{m,b} = \{(x, y) \in F^2 : y = mx + b\} \cup \{m\}, \quad m, b \in F \]
\[ l_k = \{(k, y) : y \in F\} \cup \{\infty\}, \quad k \in F \]
\[ l_\infty = \{m : m \in F\} \cup \{\infty\}. \]
The line l_∞ is called the line at infinity. These n^2 + n + 1 points and n^2 + n + 1
lines constitute a π_n. We need only check that the conditions for an (n^2 + n + 1,
n^2 + n + 1, n + 1, n + 1, 1) BIBD are satisfied. Each point (i, j) lies on exactly n
+ 1 lines, namely, the lines l_{m, j − mi}, where m ∈ F, and l_i. The reader should
check that the ideal points and the point at infinity each lie on n + 1 lines. That
each line contains n + 1 points can be seen from the definitions. We leave it to
the reader to check that each pair of points determines exactly one line.
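
For a prime n the field GF(n) is just the integers modulo n, and the construction in the proof can be carried out directly. The Python sketch below makes that simplifying assumption (it does not handle prime powers such as 4 or 9, which need genuine field arithmetic), and the labels "ideal" and "inf" as well as the function names are ours.

```python
from itertools import combinations

def projective_plane(p):
    """Points and lines of the plane built in the proof, for a prime p
    (arithmetic modulo p stands in for GF(p))."""
    F = range(p)
    # l_{m,b}: the affine line y = mx + b together with the ideal point m
    lines = [frozenset([(x, (m * x + b) % p) for x in F] + [("ideal", m)])
             for m in F for b in F]
    # l_k: the vertical line x = k together with the point at infinity
    lines += [frozenset([(k, y) for y in F] + ["inf"]) for k in F]
    # the line at infinity
    lines.append(frozenset([("ideal", m) for m in F] + ["inf"]))
    points = set().union(*lines)
    return points, lines

def is_projective_plane(points, lines, n):
    size = n * n + n + 1
    return (len(points) == size and len(lines) == size
            and all(len(l) == n + 1 for l in lines)
            and all(sum(1 for l in lines if x in l and y in l) == 1
                    for x, y in combinations(points, 2)))

for p in (2, 3, 5):
    print(p, is_projective_plane(*projective_plane(p), p))
```

Running the same code with p = 4 reports failure, since the integers modulo 4 do not form a field; this is why the sketch is restricted to primes.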


EXAMPLE 6.15 A projective plane of order 3 


Let us use the description in the above proof to construct a projective plane
of order 3 (Figure 6.1). The appropriate base field is F = GF(3), whose
elements are 0, 1, 2. The 13 points are (i, j), with 0 ≤ i, j ≤ 2, which
constitute the square array of the figure, and the ideal points i, with 0 ≤ i ≤ 2,
and ∞, which are placed to the side. The lines are l_{m,b}, where 0 ≤ m, b ≤ 2;
l_k, where 0 ≤ k ≤ 2; and l_∞.


Figure 6.1 A projective plane of order 3.


By the Bruck–Chowla–Ryser theorem, if there is a projective plane of order n
and if n ≡ 1, 2 (mod 4), then there is a solution in integers, not all zero, to the
equation
\[ x^2 = ny^2 - z^2. \]
It follows that
\[ (x/y)^2 + (z/y)^2 = n, \]
i.e., n is expressible as the sum of the squares of two rational numbers. From
elementary number theory, it follows that n is the sum of two squares of integers.
Hence, there is no projective plane of order 6 or 14. However, 10 = 3^2 + 1^2, so
the Bruck–Chowla–Ryser theorem does not rule out the possibility of a
projective plane of order 10. In 1988 C. W. H. Lam, leading a team, used a
computer to prove the nonexistence of a projective plane of order 10. See C. W.
H. Lam, “The search for a finite projective plane of order 10,” American
Mathematical Monthly, 98 (1991), pp. 305-318.
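
The sum-of-two-squares criterion is simple to tabulate. As an illustration (the function name and the search range up to 30 are our own choices), the following Python sketch lists the orders n ≤ 30 with n ≡ 1, 2 (mod 4) that are excluded by the argument above.

```python
from math import isqrt

def is_sum_of_two_squares(n):
    """True when n = a^2 + b^2 for some integers a, b >= 0."""
    return any(isqrt(n - a * a) ** 2 == n - a * a
               for a in range(isqrt(n) + 1))

excluded = [n for n in range(2, 31)
            if n % 4 in (1, 2) and not is_sum_of_two_squares(n)]
print(excluded)   # [6, 14, 21, 22, 30]
```

The orders 10, 18, and 26 pass this test, so the theorem gives no information about them; as noted above, order 10 was settled only by a computer search.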

As we showed, there exists a projective plane of every prime power order. No
projective plane of order not a prime power is known to exist, and it is
conjectured that there is none.


Open problem. Construct a projective plane of order 12 or show that 
none exists. 


A finite affine plane π’_n is a projective plane of order n without the ideal
points and the line at infinity. A π’_n is an (n^2, n^2 + n, n + 1, n, 1) BIBD. For
example, a π’_3 is a (9, 12, 4, 3, 1) BIBD, which we have already seen is an S(2,
3, 9) Steiner system. In general, a π’_n is an S(2, n, n^2). A projective plane of
order n exists if and only if there exists an affine plane of order n. See [20].


EXERCISES 


6.14 An instructor has 25 students whom she wishes to divide into five 
groups of five students each. The students will be regrouped each class day, 
and the class will meet for a large number of days. How can she do the 
grouping so as to minimize the number of times that any two students are in 
the same group? 
6.15 Let M be the 13 x 13 circulant matrix whose first row is the 
characteristic vector of the set {1, 2, 4,10}. Show that M is an incidence 
matrix for a (13, 4, 1) SBD, i.e., a projective plane of order 3. 
6.16 Draw a projective plane of order 4. 
6.17 Give a counting argument for the order of the automorphism group of a 
projective plane of order 3. 

The automorphism group of a projective plane over a field F is a
semidirect product Aut F · PGL(3, |F|). If F is GF(q), where q = p^k, then
|Aut F| = k.


6.4 Latin squares 


A Latin square L of order n is an n × n array [L(i, j)] in which each row and each
column contains all the elements of N_n. An r × n Latin rectangle consists of the
first r rows of a Latin square of order n.

It is easy to create a Latin square of any order, as in the next example. 


EXAMPLE 6.16 A Latin square of order 3 


To create a Latin square of order 3, take the first row to be 1, 2, 3, and
successive rows to be shifts of the first row.


1 2 3
2 3 1
3 1 2
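
The shifting recipe works for every order. The following Python sketch (with function names of our own choosing) builds the cyclic Latin square of order n and checks the defining property; for n = 3 it reproduces the square above.

```python
def cyclic_latin_square(n):
    """Row i is the first row 1, ..., n shifted cyclically by i places."""
    return [[(i + j) % n + 1 for j in range(n)] for i in range(n)]

def is_latin(L):
    """Each row and each column must contain every symbol 1, ..., n."""
    n = len(L)
    symbols = set(range(1, n + 1))
    return (all(len(row) == n and set(row) == symbols for row in L)
            and all(set(col) == symbols for col in zip(*L)))

for row in cyclic_latin_square(3):
    print(*row)
print(all(is_latin(cyclic_latin_square(n)) for n in range(1, 10)))
```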


EXAMPLE 6.17 


The Cayley table of a finite group G with elements g_1,...,g_n yields a Latin
square L of order n. Let the (i, j) entry of L be k, where g_i g_j = g_k. For
instance, G = Z_2 × Z_2, with elements (0, 0), (0, 1), (1, 0), (1, 1), yields the
Latin square


1 2 3 4
2 1 4 3
3 4 1 2
4 3 2 1

Not all Latin squares come from groups. However, Latin squares are equivalent
to the multiplication tables of primitive algebraic structures called quasigroups.
We record some interesting definitions. A groupoid is a nonempty set S and a
binary operation * defined on S. A semigroup is a groupoid in which * is an
associative operation. A monoid is a semigroup containing a two-sided identity
element e (x * e = e * x = x for all x). A group is a monoid in which every
element x in S has a two-sided inverse x^{-1} (x * x^{-1} = x^{-1} * x = e). A
quasigroup is a groupoid such that given a, b ∈ S there exist unique x, y with a *
x = b and y * a = b. A loop is a quasigroup containing a two-sided identity e. The
literature abounds with examples of these algebraic structures. For instance,
given any S(2, 3, n) Steiner triple system S we may define a quasigroup whose
elements are members of S by setting, for a and b distinct, a * b = c, where c is
the unique element in a triple with a and b, and setting a * a = a. By adding an
identity element 1 and properly extending the definition of multiplication we
may turn this quasigroup into a loop. Another example of a loop is the famous
Cayley loop of order 16. To see some other loops the reader should consult [7].

Suppose that we have a quasigroup with n elements, g_1,...,g_n. By definition,
each row and each column of its multiplication table is a permutation of the n
elements. Therefore, replacing g_1,...,g_n by the numbers 1,...,n results in a Latin
square of order n. Conversely, any Latin square of order n is the multiplication
table of a quasigroup of order n.
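
To make the Steiner triple system example concrete, the Python sketch below forms the multiplication table of the quasigroup attached to the S(2, 3, 9) system of Example 6.6 and confirms that it is a Latin square. The function names are ours, introduced only for this illustration.

```python
from itertools import permutations

def steiner_quasigroup(n, triples):
    """Multiplication table on {1, ..., n}: a * a = a, and for a != b the
    product a * b is the third point of the triple through a and b."""
    third = {}
    for triple in triples:
        for a, b, c in permutations(triple):
            third[(a, b)] = c
    return [[a if a == b else third[(a, b)] for b in range(1, n + 1)]
            for a in range(1, n + 1)]

def is_latin(L):
    n = len(L)
    symbols = set(range(1, n + 1))
    return (all(set(row) == symbols for row in L)
            and all(set(col) == symbols for col in zip(*L)))

triples = [tuple(int(ch) for ch in w)
           for w in "123 456 789 147 168 159 258 369 249 357 267 348".split()]
Q = steiner_quasigroup(9, triples)
for row in Q:
    print(*row)
print(is_latin(Q))
```

The table is symmetric with diagonal 1, ..., 9, reflecting the fact that this quasigroup is commutative and idempotent.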
Two Latin squares L_1 and L_2 are equivalent if L_1 can be transformed into L_2
by the following operations:
1. Reordering rows
2. Reordering columns
3. Permuting symbols


It is an open problem to determine an asymptotic formula for the number L(n)
of Latin squares of order n (equivalently, the number of quasigroups of order n)
and the number L*(n) of inequivalent Latin squares of order n. However, the
following existence theorem allows us to formulate a lower bound for L(n).


Hall's marriage theorem (1935). Let S_1,...,S_n be finite sets. There exist
distinct s_i ∈ S_i (for each i) if and only if the following condition holds for
each k with 1 ≤ k ≤ n: the union of any k of the S_i contains at least k
elements.


The set {s_i} is called a system of distinct representatives (SDR) for the S_i. If S_i
is a list of men whom woman i would like to marry, then an SDR is a feasible set
of marriages; hence the title of the theorem.

Proof. The necessity of the conditions is obvious. We must prove that the
conditions are sufficient. Certainly, there is a representative for the first set.
Assume that distinct representatives exist for the first k of the sets. We will show
that an SDR can be found for k + 1 sets. Let T_1 be a set which has no
representative assigned to it yet. If there is an element of T_1 not already
occurring as a representative of one of the other k sets, then we are done.
Otherwise, note that T_1 has at least one element, say t_1, and suppose that t_1
represents T_2. By hypothesis, T_1 ∪ T_2 contains at least one element other than
t_1, say t_2. If t_2 is not already a representative, then stop. If t_2 represents a set T_3,
then find t_3 ∈ T_1 ∪ T_2 ∪ T_3 other than t_1 and t_2. Continuing in this manner, we
find a collection {t_i} such that t_i ∈ T_1 ∪ ··· ∪ T_i and t_i represents T_{i+1} (for i < a),
and t_a is not a representative yet. Now we change some representatives by pairing
t_a with a set T_{a’} that contains it, where a’ ≤ a. This process continues until T_1 is
paired with a representative. These new pairings, together with the unchanged
pairings, constitute an SDR for k + 1 sets.
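
The re-pairing step in the proof is the idea behind the standard augmenting-path method for finding an SDR. The Python sketch below is one such implementation, given as an illustration rather than as a transcription of the proof; the names find_sdr and hall_condition are ours. The second function tests Hall's condition by brute force, so the two can be compared on small examples.

```python
from itertools import combinations

def find_sdr(sets):
    """Return a list of distinct representatives (one per set), or None.
    Sets are handled one at a time; when the wanted element is taken,
    its current owner is asked to switch to another element."""
    owner = {}                      # element -> index of the set it represents

    def assign(i, seen):
        for x in sets[i]:
            if x in seen:
                continue
            seen.add(x)
            if x not in owner or assign(owner[x], seen):
                owner[x] = i
                return True
        return False

    for i in range(len(sets)):
        if not assign(i, set()):
            return None
    reps = [None] * len(sets)
    for x, i in owner.items():
        reps[i] = x
    return reps

def hall_condition(sets):
    """Brute-force check that every k of the sets have at least k elements."""
    return all(len(set().union(*group)) >= k
               for k in range(1, len(sets) + 1)
               for group in combinations(sets, k))

good = [{1, 2}, {1}, {2, 3}]
bad = [{1, 2}, {1, 2}, {1, 2}]
print(find_sdr(good), hall_condition(good))
print(find_sdr(bad), hall_condition(bad))
```

For the first family an SDR is found and Hall's condition holds; for the second, three sets share only two elements, so the condition fails and no SDR exists.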
The following corollary is proved in [12]. 


Corollary. If S_1,...,S_n are sets possessing an SDR, and if the smallest set
has size t ≤ n, then the S_i possess at least t! SDRs.


Lower bound for the number of Latin squares. L(n) ≥ n!(n − 1)! ··· 2! 1!.


Proof. We will show that for each r, where 1 ≤ r ≤ n − 1, an r × n Latin rectangle
may be extended to an (r + 1) × n Latin rectangle in at least (n − r)! ways. Given
an r × n Latin rectangle, let S_i be the set of numbers not yet used in column i.
Clearly, an SDR could be used as the (r + 1)st row of the Latin square. Now,
each element m, with 1 ≤ m ≤ n, has occurred in r rows and hence in r columns
of the Latin rectangle thus far. Therefore, each element occurs in exactly n − r of
the S_i. For each k, the union of k of the S_i contains k(n − r) elements (counting
repetitions). As each element occurs in at most n − r of these S_i, the union must
contain at least k elements, and the criterion in Hall's theorem is satisfied.
Hence, there is an SDR for the S_i.

Because each S_i has size n − r, the corollary guarantees the existence of at least
(n − r)! SDRs. The inequality on L(n) is established by applying the above
estimate as each successive row is added to the Latin square.

By a permutation of its rows and columns, any Latin square may be written
with 1,...,n as its first row and first column. Such a Latin square is said to be
standardized. If L'(n) is the number of inequivalent standardized Latin squares of
order n, then L(n) = n!(n − 1)!L'(n). Table 6.1 gives the values of L'(n) for 1 ≤ n
≤ 7.


Table 6.1 The number of standardized Latin squares.


n       1   2   3   4   5    6      7
L'(n)   1   1   1   4   56   9408   16942080
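
To see how far the lower bound is from the truth, one can combine the relation L(n) = n!(n − 1)!L'(n) with the entries of Table 6.1. The short Python sketch below does this; the dictionary simply copies the table, and the variable names are ours.

```python
from math import factorial, prod

# Values of L'(n) copied from Table 6.1.
standardized = {1: 1, 2: 1, 3: 1, 4: 4, 5: 56, 6: 9408, 7: 16942080}

for n, count in standardized.items():
    total = factorial(n) * factorial(n - 1) * count          # L(n)
    bound = prod(factorial(k) for k in range(1, n + 1))      # n!(n-1)!...1!
    assert total >= bound
    print(f"n = {n}: L(n) = {total}, lower bound = {bound}")
```

The bound is attained for n ≤ 3 and falls behind from n = 4 onward, where L(4) = 576 while the bound gives 288.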


Open problem. Find a formula for L’(n). 


EXERCISES 
6.18 Verify that L'(4) = 4. 


6.19 Use a computer to verify that L'(5) = 56. 


6.5 MOLS and OODs 


Two Latin squares L_1 = [L_1(i, j)] and L_2 = [L_2(i, j)] of order n are orthogonal if,
for every (a, b) ∈ N_n × N_n, there is an ordered pair (i, j) with (L_1(i, j), L_2(i, j)) =
(a, b). In other words, the ordered pairs (L_1(i, j), L_2(i, j)) take each of the n^2
values in N_n × N_n exactly once. The two Latin squares of Figure 6.2 are
orthogonal.

Figure 6.2 Two orthogonal Latin squares of order 3.
1 2 3      1 2 3
3 1 2      2 3 1
2 3 1      3 1 2
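
Orthogonality is a finite condition and is easily checked by collecting the ordered pairs. In the following Python sketch the function name are_orthogonal is ours; it is applied to the two squares of Figure 6.2.

```python
def are_orthogonal(A, B):
    """Superimpose two n x n arrays and test whether the n^2 ordered
    pairs of entries are all different."""
    n = len(A)
    pairs = {(A[i][j], B[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n

L1 = [[1, 2, 3], [3, 1, 2], [2, 3, 1]]
L2 = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]
print(are_orthogonal(L1, L2))   # True
print(are_orthogonal(L1, L1))   # False: only the pairs (a, a) occur
```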


A set of mutually orthogonal Latin squares, or MOLS, is a set in which every
pair is orthogonal. MOLS are also called pairwise orthogonal Latin squares. We
define m(n) to be the maximum possible number of MOLS of order n. Leonhard
Euler (1707-1783) introduced the ideas of Latin squares and MOLS in 1782
when he asked whether there are two MOLS of order 6. He believed the answer
is no and therefore conjectured that m(6) = 1. This was proved by G. Tarry in
1900. Euler also conjectured that m(n) = 1 whenever n ≡ 2 (mod 4), but it was
shown in 1960 by R. C. Bose, E. T. Parker, and S. S. Shrikhande that m(n) ≥ 2
except when n = 1, 2, or 6. For example, there are two MOLS of order 10.
However, it is not known whether there are three MOLS of order 10.


Theorem. For all n ≥ 2, we have m(n) ≤ n − 1.


Proof. Suppose that there is a set of n MOLS of order n. By a permutation of
symbols, the first row of each Latin square can be changed to 1,...,n, and
permuting symbols clearly does not disturb orthogonality. Now, by the
pigeonhole principle, since none of the (2, 1) entries of the n MOLS can equal 1,
some two Latin squares have (2, 1) entry equal to i, with 2 ≤ i ≤ n. But these
Latin squares are not orthogonal because the ordered pair (i, i) occurs twice in
the list of ordered pairs of entries.


When n = p^k for a prime p, we can construct n − 1 MOLS of order n. Suppose
that F is the field GF(p^k), and let F = {0 = f_0,...,f_{n−1}}. For each m, where 1 ≤ m
≤ n − 1, define the Latin square L_m = [L_m(i, j)]_{n × n}, where 0 ≤ i, j ≤ n − 1, by
L_m(i, j) = f_m f_i + f_j. It is a simple matter to check that each L_m is a Latin square.
To check the orthogonality condition for distinct indices m and m’, observe that
(f_m f_i + f_j, f_{m’} f_i + f_j) = (f_m f_k + f_l, f_{m’} f_k + f_l) implies that f_i = f_k and
f_j = f_l, so all the ordered pairs are distinct.

For example, starting with the three-element field {0, 1, 2}, we produce the
two MOLS

0·1+0 0·1+1 0·1+2      0·2+0 0·2+1 0·2+2

1·1+0 1·1+1 1·1+2      1·2+0 1·2+1 1·2+2

2·1+0 2·1+1 2·1+2      2·2+0 2·2+1 2·2+2
or

0 1 2      0 1 2
1 2 0      2 0 1
2 0 1      1 2 0
(two orthogonal Latin squares equivalent to those in Figure 6.2).
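
When n is a prime p, the field is the integers modulo p and the construction becomes L_m(i, j) = mi + j (mod p). The Python sketch below carries this out under that assumption (prime powers would need true field arithmetic, which is not attempted here) and verifies that the p − 1 squares are Latin and mutually orthogonal. The function names are our own.

```python
from itertools import combinations

def field_mols(p):
    """The p - 1 squares L_m(i, j) = m*i + j (mod p), for a prime p.
    Entries are 0, ..., p-1, as in the three-element example."""
    return [[[(m * i + j) % p for j in range(p)] for i in range(p)]
            for m in range(1, p)]

def is_latin(L):
    n = len(L)
    symbols = set(range(n))
    return (all(set(row) == symbols for row in L)
            and all(set(col) == symbols for col in zip(*L)))

def are_orthogonal(A, B):
    n = len(A)
    return len({(A[i][j], B[i][j]) for i in range(n) for j in range(n)}) == n * n

for p in (3, 5, 7):
    squares = field_mols(p)
    ok = (all(is_latin(L) for L in squares)
          and all(are_orthogonal(A, B) for A, B in combinations(squares, 2)))
    print(p, len(squares), ok)
```

For p = 3 the two squares produced are exactly the ones displayed above.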


We have described how a projective plane of order n can be constructed from 
the field GF(n). The above argument shows that n — 1 MOLS of order n can be 
constructed from GF(n). In fact, a projective plane of order n is equivalent to a 
set of n— 1 MOLS of order n. 


Theorem. A set of n — 1 MOLS of order n is equivalent to a projective 
plane of order n. 


Proof. Suppose that we are given n − 1 MOLS: L_1,...,L_{n−1}. Let
\[ S = \{(i, j) \in N_n \times N_n\} \cup \{i : i \in N_{n-1}\} \cup \{0, \infty\}, \]
and define
\[ l_{\infty,k} = \{(k, i) : i \in N_n\} \cup \{\infty\}, \quad 1 \le k \le n \]
\[ l_{0,k} = \{(i, k) : i \in N_n\} \cup \{0\}, \quad 1 \le k \le n \]
\[ l_{x,y} = \{(i, j) : L_x(i, j) = y\} \cup \{x\}, \quad 1 \le x \le n - 1, \ 1 \le y \le n \]
\[ l_\infty = \{i : i \in N_{n-1}\} \cup \{0, \infty\}. \]

We leave it to the reader to check that the incidence matrix for the set of points S
and the lines l_{∞,k}, l_{0,k}, l_{x,y}, l_∞ is that of an (n^2 + n + 1, n + 1, 1) SBD.


Reversing the above construction completes the equivalence.


The reader may find it instructive to apply the construction technique to the 
two MOLS of Figure 6.2 to construct a projective plane of order 3. It will be 
equivalent to the one of Figure 6.1. 

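Let us extend the definition of orthogonality to any two square matrices (not
necessarily Latin squares). We say that two n x n matrices A and B with entries
from $N_n$ are orthogonal if the ordered pairs (A(i, j), B(i, j)) take each of the $n^2$
values in $N_n \times N_n$ exactly once. Observe that the matrices

$$
R = \begin{bmatrix}
1 & 2 & 3 & \cdots & n \\
1 & 2 & 3 & \cdots & n \\
\vdots & \vdots & \vdots & & \vdots \\
1 & 2 & 3 & \cdots & n
\end{bmatrix}
\quad\text{and}\quad
C = \begin{bmatrix}
1 & 1 & \cdots & 1 \\
2 & 2 & \cdots & 2 \\
\vdots & \vdots & & \vdots \\
n & n & \cdots & n
\end{bmatrix}
$$
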


are orthogonal. With this generalized definition of orthogonality, we can give an 
elegant characterization of Latin squares: an n x n matrix is an order n Latin 
square if and only if it is orthogonal to both R and C. Thus, any k MOLS of order 
n are part of a family of k + 2 mutually orthogonal matrices. Conversely, any k + 
2 mutually orthogonal n x n matrices may be transformed into R, C, and k MOLS 
of order n. For if a matrix M is orthogonal to another matrix, then M contains 
each of the numbers 1,...,n exactly n times. Therefore, choosing two matrices M 
and N from the set of k + 2 orthogonal matrices, we may transform M into R and 
N into C by a simultaneous permutation of the entries of all the matrices. 
Discarding R and C, we are left with k MOLS of order n. 

This characterization replaces the notion of Latinicity with the more essential 
notion of orthogonality. Accordingly, we define an ordered orthogonal design of
order $n$ and depth $s$ (an $(n, s)$ OOD) to be an $s \times n^2$ matrix $(m_{ij})$ with entries
$1, \ldots, n$ such that every two rows are orthogonal. That is, for every pair of rows $u$
and $v$, every ordered pair $(a, b)$ with $1 \le a, b \le n$ occurs exactly once among the
ordered pairs $(m_{uj}, m_{vj})$. For example, Figure 6.3 shows a (3, 4) OOD derived
from the two Latin squares of order 3 in Figure 6.2. An OOD is also called an
OA (orthogonal array).
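
As a rough illustration of this definition, the sketch below flattens two auxiliary matrices (one whose (i, j) entry is i, one whose (i, j) entry is j) together with k MOLS into the rows of an (n, k + 2) OOD and checks pairwise orthogonality of the rows. The two order 3 squares used here are an assumption made for the example (they come from the rule m·i + j modulo 3, shifted to the symbols 1, 2, 3) and are not necessarily the squares of Figure 6.2; the helper names are hypothetical.

```python
from itertools import combinations

def flatten(matrix):
    """Read an n x n matrix row by row into a list of length n*n."""
    return [entry for row in matrix for entry in row]

def build_ood(squares):
    """Stack the two auxiliary matrices and the given squares as OOD rows."""
    n = len(squares[0])
    by_row = [[i + 1] * n for i in range(n)]             # (i, j) entry is i
    by_col = [list(range(1, n + 1)) for _ in range(n)]   # (i, j) entry is j
    return [flatten(m) for m in (by_row, by_col, *squares)]

def is_ood(rows, n):
    """Every two rows must realize all n*n ordered pairs exactly once."""
    return all(len(set(zip(u, v))) == n * n
               for u, v in combinations(rows, 2))

# Assumed example squares of order 3 (symbols 1..3).
L1 = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]
L2 = [[1, 2, 3], [3, 1, 2], [2, 3, 1]]

ood = build_ood([L1, L2])
for row in ood:
    print(row)
print(is_ood(ood, 3))  # True
```

Because each row has exactly $n^2$ entries, counting distinct ordered pairs is enough: $n^2$ distinct pairs out of $n^2$ columns means each pair occurs exactly once.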


Figure 6.3 A (3,4) OOD. 
The design inherent in the theorem can be produced directly from an (n, n + 1)
OOD. In general, an OOD-net based on an (n, s) OOD is a collection of $n^2$ points
corresponding to the columns of the OOD and s pencils of parallel lines, where
point $x_j$ is on line y of the ith pencil if the (i, j) entry of the matrix is y. The
OOD-net of an (n, n + 1) OOD is an affine plane of order n which may be
extended to a projective plane of order n by the addition of one ideal point for
each row of the OOD and an ideal line.

In summary, we have shown the equivalence of the following discrete 
configurations: 

• a projective plane of order n;
• a collection of n - 1 MOLS of order n;
• an (n, n + 1) OOD.


Open problem. Determine the values of n for which these structures 
exist. 


A possible conjecture consistent with what is known is that these structures 
exist if and only if n is a prime power. 


EXERCISES 
6.20 Prove that the n x n arrays $A = [a_{ij}]$ and $B = [b_{ij}]$, where $a_{ij} = i + j$ and
$b_{ij} = i - j$ (modulo n), are orthogonal when n is an odd number.


6.21 Construct a set of three 4 x 4 MOLS (equivalent to a projective plane of 
order 4). 


6.22 Construct a set of four 5 x 5 MOLS (equivalent to a projective plane of 
order 5). 


6.23 Construct a (5, 6) OOD. 


6.6 Hadamard matrices 


In 1893 Jacques Hadamard considered a basic problem about the maximum 
absolute value of the determinant of a matrix with bounded entries. 


Hadamard's theorem (1893). Suppose $A = [a_{ij}]$ is a matrix of order $m$
with $-1 \le a_{ij} \le 1$ for all $i$ and $j$. Then $|\det A| \le m^{m/2}$, and the upper bound
is attained if and only if $a_{ij} = \pm 1$ for all $i$ and $j$ and $AA^t = mI$.


Proof. The rows of A are vectors in $\mathbf{R}^m$ of length at most $m^{1/2}$, and they span a
parallelepiped of volume $|\det A|$. This volume is clearly maximized when the
vectors are mutually orthogonal and of maximum possible length, and in this
case the volume is the product of the lengths, $m^{m/2}$.


A matrix $A = [a_{ij}]_{m \times m}$ with $a_{ij} = \pm 1$ and $AA^t = mI$ is called a Hadamard
matrix of order m. The condition $AA^t = mI$ means that the dot product of any two
distinct rows of A is zero. (The same is true for columns, as
$A^tA = A^{-1}(AA^t)A = A^{-1}(mI)A = mI$.) Therefore, without regard to the volume
argument given in the proof above, if A is a Hadamard matrix of order m, then
$m^m = \det AA^t = (\det A)^2$, which implies that $|\det A| = m^{m/2}$.

Notice that the theorem does not address the question of the maximum value
of $|\det A|$ when there is no Hadamard matrix of order m. However, $|\det A|$ does
attain a maximum, as it is a continuous function defined on a compact set (the
cube $[-1, 1]^{m^2}$). Furthermore, because det A is a linear function of each entry
$a_{ij}$ (i.e., a straight line), the function $y = |\det A|$ is concave upward and therefore
the maximum of $|\det A|$ occurs when each $a_{ij} = -1$ or 1. The maximum
determinant may also occur for other matrices, in case the coefficient of the
first-order term in the linear equation just described is 0. For example,

$$
\begin{vmatrix}
0 & -1 & 1 \\
1 & 1 & 1 \\
-1 & 1 & 1
\end{vmatrix} = 4.
$$


The Kronecker product $A \otimes B$ of two square matrices $A = [a_{ij}]_{m_1 \times m_1}$ and
$B = [b_{ij}]_{m_2 \times m_2}$ is the square matrix $A \otimes B = [a_{ij}B]_{m_1 m_2 \times m_1 m_2}$. The Kronecker
product produces larger Hadamard matrices from smaller ones. For example,
Figure 6.4 shows Hadamard matrices A and B of orders 2 and 4, respectively,
with $B = A \otimes A$.


Figure 6.4 Hadamard matrices of orders 2 and 4.


If H is a Hadamard matrix, then so is

$$\begin{bmatrix} H & H \\ H & -H \end{bmatrix}.$$


Theorem. There exists a Hadamard matrix of order $m = 2^k$, where k is
any positive integer.
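
The doubling step behind this theorem can be sketched as repeated Kronecker products with a 2 x 2 Hadamard matrix. In the sketch below the starting matrix [[1, 1], [1, -1]] is an assumption made for illustration (any order 2 Hadamard matrix would serve), and `kronecker`, `is_hadamard`, and `sylvester` are hypothetical helper names.

```python
def kronecker(a, b):
    """Kronecker product [a_ij * B] of two square matrices (lists of lists)."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def is_hadamard(h):
    """Check that entries are +1 or -1 and that H times its transpose is m*I."""
    m = len(h)
    entries_ok = all(x in (1, -1) for row in h for x in row)
    gram_ok = all(
        sum(h[i][k] * h[j][k] for k in range(m)) == (m if i == j else 0)
        for i in range(m) for j in range(m))
    return entries_ok and gram_ok

def sylvester(k):
    """Return a Hadamard matrix of order 2**k by repeated doubling."""
    h2 = [[1, 1], [1, -1]]   # assumed order 2 starting matrix
    h = [[1]]
    for _ in range(k):
        h = kronecker(h2, h)  # block form [[h, h], [h, -h]]
    return h

for row in sylvester(2):
    print(row)
# [1, 1, 1, 1]
# [1, -1, 1, -1]
# [1, 1, -1, -1]
# [1, -1, -1, 1]
print(is_hadamard(sylvester(4)))  # True
```

The matrices produced this way have first row and first column consisting of all 1's.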


If A is a Hadamard matrix, then any permutation of the rows or columns of A 
is a Hadamard matrix. Also, any row or column of A may be multiplied by —1 
with the result still a Hadamard matrix. With these operations it is possible to 
alter any Hadamard matrix so that its first row and first column consist of all 1's. 
Such a Hadamard matrix is said to be normalized. 


Theorem. If A is a Hadamard matrix of order m > 2, then m is a multiple 
of 4. 


Proof. Normalize A and permute its columns so that its first three rows look like 
this: 


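$$
\begin{array}{cccc}
\overbrace{1 \ \cdots \ 1}^{a} & \overbrace{1 \ \cdots \ 1}^{b} & \overbrace{1 \ \cdots \ 1}^{c} & \overbrace{1 \ \cdots \ 1}^{d} \\
1 \ \cdots \ 1 & 1 \ \cdots \ 1 & -1 \ \cdots \ -1 & -1 \ \cdots \ -1 \\
1 \ \cdots \ 1 & -1 \ \cdots \ -1 & 1 \ \cdots \ 1 & -1 \ \cdots \ -1
\end{array}
$$

(The variables a, b, c, d, which count the columns of each of the four types, are
to be determined.) Four equations are immediate from the definition of a
Hadamard matrix:

$$
\begin{aligned}
a + b + c + d &= m \\
a + b - c - d &= 0 \\
a - b + c - d &= 0 \\
a - b - c + d &= 0.
\end{aligned}
$$

Adding the equations yields 4a = m, from which it follows that a = b = c = d =
m/4. Therefore, m is a multiple of 4, and furthermore we have found that any
row after the first has m/2 1's and m/2 -1's, and any two rows (not including the
first) have 1's together in m/4 columns.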


It is conjectured that the condition of the theorem is sufficient as well as 
necessary. 


Conjecture. There exists a Hadamard matrix of every order a multiple of 
4. 


The smallest order for which the existence of a Hadamard matrix is not certain 
is 668. 


Open problem. Determine whether there is a Hadamard matrix of order 
668. 


Theorem. Let H be a normalized Hadamard matrix of order 4n ≥ 8.
Deleting the first row and column of H and changing each -1 to a 0
results in an incidence matrix of a (4n - 1, 2n - 1, n - 1) SBD.
Conversely, starting with an incidence matrix of a (4n - 1, 2n - 1, n - 1)
SBD, changing each 0 to -1, and adding a first row and first column of all
1's, yields a normalized Hadamard matrix of order 4n.


Proof. Let X be the submatrix of H formed by deleting its first row and column.
The Hadamard conditions imply that $XJ = JX = -J$ and $XX^t = 4nI - J$. When each
-1 is switched to 0, a new matrix $Y = \frac{1}{2}(X + J)$ results. We check that Y is the
incidence matrix of a (4n - 1, 2n - 1, n - 1) SBD:
$JY = YJ = \frac{1}{2}(-J + (4n - 1)J) = (2n - 1)J$ and
$YY^t = \frac{1}{4}(X + J)(X^t + J) = nI + (n - 1)J$. The proof of the reverse
construction is similar.
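
The forward direction of this theorem is easy to test numerically. The sketch below assumes a normalized Hadamard matrix obtained by the doubling construction [[H, H], [H, -H]] (one convenient choice, not the only one), deletes the first row and column, replaces -1 by 0, and reads off the parameters (v, k, λ) from the row sums, column sums, and pairwise row overlaps. The function names are hypothetical.

```python
def sylvester(k):
    """Normalized Hadamard matrix of order 2**k built by doubling."""
    h = [[1]]
    for _ in range(k):
        h = ([row + row for row in h] +
             [row + [-x for x in row] for row in h])
    return h

def hadamard_to_sbd(h):
    """Delete the first row and column, then change each -1 to 0."""
    return [[(x + 1) // 2 for x in row[1:]] for row in h[1:]]

def sbd_parameters(y):
    """Return (v, k, lam) if y looks like an SBD incidence matrix, else None."""
    v = len(y)
    row_sums = {sum(row) for row in y}
    col_sums = {sum(y[i][j] for i in range(v)) for j in range(v)}
    overlaps = {sum(a * b for a, b in zip(y[i], y[j]))
                for i in range(v) for j in range(i + 1, v)}
    if len(row_sums) == 1 and row_sums == col_sums and len(overlaps) == 1:
        return (v, row_sums.pop(), overlaps.pop())
    return None

print(sbd_parameters(hadamard_to_sbd(sylvester(3))))  # (7, 3, 1)
print(sbd_parameters(hadamard_to_sbd(sylvester(4))))  # (15, 7, 3)
```

With 4n = 8 and 4n = 16 the parameters agree with (4n - 1, 2n - 1, n - 1) for n = 2 and n = 4.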


EXAMPLE 6.18 


The (7, 3, 1) SBD is equivalent to a Hadamard matrix of order 8. 
A (4n - 1, 2n - 1, n - 1) SBD created this way is called a Hadamard design of
order n. A 3-(4n, 2n, n - 1) design may be formed by taking complements of a
Hadamard design H together with the blocks of H joined to a new point ∞.

Hadamard designs and projective planes are two extreme types of $(v, k, \lambda)$
designs. For if a $(v, k, \lambda)$ design exists, then

(6.7) $4n - 1 \le v \le n^2 + n + 1,$

where $n = k - \lambda$. To prove this inequality, we let $\lambda' = v - 2k + \lambda$. Then
$\lambda + \lambda' = v - 2n$ and $\lambda\lambda' = n(n - 1)$. The upper bound follows from the
observation that $\lambda' \ge 1$, and the lower bound from the arithmetic mean-geometric
mean inequality, $(\lambda + \lambda')^2 \ge 4\lambda\lambda'$. One can show that the upper bound is met if
and only if the design is a projective plane of order n (or its complement) and the
lower bound is met only for a Hadamard design of order n (or its complement).

Let H be a normalized Hadamard matrix of order m = 4n, with each -1
changed to 0, and let A be the code consisting of the rows of H and the binary
complements of these rows. Clearly, A is a code of dimension 4n containing 8n
codewords. We claim that the distance of A is 2n. We are guaranteed that any
two rows of H disagree in exactly 2n places. Therefore, any two rows of the
binary complementary matrix H' disagree in 2n entries. Suppose that $a \in H$ and
$b \in H'$. If $a = b^c$, then d(a, b) = m. If not, then d(a, b) equals the number of
components in which a and $b^c$ agree, which is 2n. This (4n, 8n, 2n) code is
called a Hadamard code. It is capable of detecting 2n - 1 errors and correcting
n - 1 errors.


EXAMPLE 6.19 
The (7, 3, 1) SBD yields an (8, 16, 4) code. We saw this code in Chapter 5. 
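
The parameters (4n, 8n, 2n) can be checked by brute force for small orders. The sketch below is only an illustration: it assumes the Hadamard matrix comes from the doubling construction [[H, H], [H, -H]], and the helper names are hypothetical.

```python
from itertools import combinations

def sylvester(k):
    """Normalized Hadamard matrix of order 2**k built by doubling."""
    h = [[1]]
    for _ in range(k):
        h = ([row + row for row in h] +
             [row + [-x for x in row] for row in h])
    return h

def hadamard_code(h):
    """Rows of h with -1 changed to 0, together with their complements."""
    rows = [tuple((x + 1) // 2 for x in row) for row in h]
    complements = [tuple(1 - x for x in row) for row in rows]
    return rows + complements

def min_distance(code):
    """Smallest Hamming distance between two distinct codewords."""
    return min(sum(a != b for a, b in zip(u, v))
               for u, v in combinations(code, 2))

code = hadamard_code(sylvester(3))
print(len(code[0]), len(code), min_distance(code))  # 8 16 4
```

For order 8 this gives the (8, 16, 4) code of the example; order 16 gives a (16, 32, 8) code in the same way.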


EXAMPLE 6.20 


From the complement of the quadratic residues construction for p = 11, we
obtain the incidence matrix Y of an (11, 5, 2) SBD.

[The 11 x 11 matrix Y is not reproduced here.]

From Y we construct a normalized Hadamard matrix H of order 12.

[The 12 x 12 matrix H is not reproduced here.]

We shall see in the next section that Y is one of the main ingredients in
producing a perfect $(23, 2^{12}, 7)$ code. Switching -1 back to 0, the rows of H and
their complements constitute a (12, 24, 6) code which detects five errors and
corrects two errors. The words of this code are listed below.
The families of codes we have encountered can be arranged on a continuum
from good rate/bad distance to bad rate/good distance, as in Figure 6.5. At the
left extreme is the code $F^n$, in which every vector is a codeword. The rate is 1,
but the code is incapable of correcting any errors. At the right extreme is a code
consisting of any vector $v$ and its complement $v^c$. Although this code has the
highest possible distance, d = n, its rate is 1/n, the lowest possible. The
Hamming codes are capable of correcting e = 1 error, and their rates tend to 1.
Therefore, the Hamming codes converge to the left endpoint of the continuum.
For the Hadamard codes,

(6.8) $\text{rate} = \dfrac{\log_2 8n}{4n} \to 0 \quad (\text{as } n \to \infty).$

Since their rates converge to 0, the Hadamard codes converge to the right
endpoint of the continuum.


Figure 6.5 The world of codes.


The $(23, 2^{12}, 7)$ Golay code $\mathcal{G}_{23}$, which we will construct in the next section,
has rate 12/23 and corrects e = 3 errors.

We conclude the discussion of combinatorial designs by constructing three
large, interesting, related configurations, namely, the $(23, 2^{12}, 7)$ Golay code
$\mathcal{G}_{23}$ (the only perfect multi-error-correcting binary code), the S(5, 8, 24) Steiner
system (consisting of the weight 8 codewords of the extended Golay code $\mathcal{G}_{24}$),
and Leech's lattice $\mathcal{L}$ (a 24-dimensional lattice obtained from $\mathcal{G}_{24}$ which
generates a surprisingly dense sphere packing).


EXERCISES 


6.24 Construct a Hadamard matrix of order 8. Change this into a (7, 3, 1) 
SBD, i.e., a Fano Configuration. Show that the code produced from this 
matrix has parameters (8, 16, 4). 


6.25 Show that the determinant of a square matrix is a linear function of
each of its entries. That is, if $A = [a_{ij}]_{m \times m}$ and $a_{ij}$ is a fixed number for
each (i, j) except $(i_0, j_0)$, then there exist numbers $\alpha$ and $\beta$, not depending on
$a_{i_0 j_0}$, such that $\det A = \alpha a_{i_0 j_0} + \beta$.


6.26 Use the Kronecker product to construct a Hadamard matrix of order 16. 
Construct a (15, 7, 3) SBD. What code does this design give? 


6.27 Construct a (19, 9, 4) SBD. What code does this design give? 


6.7 The Golay code and S(5, 8, 24) 


We have indicated that the only feasible parameters for perfect binary codes with
e > 1 are $(90, 2^{78}, 5)$ and $(23, 2^{12}, 7)$. An exercise calls for a proof that no
$(90, 2^{78}, 5)$ code exists. We will construct a $(23, 2^{12}, 7)$ code called the Golay code,
$\mathcal{G}_{23}$, which was discovered by Marcel Golay (1902-1989). It is a perfect 3-error
correcting code with $2^{12}$ words, sitting inside $F^{23}$. As a bonus, we will find that
certain words in the extended Golay code $\mathcal{G}_{24}$ constitute a Steiner system
S(5, 8, 24).

Let M be an (11, 6, 3) SBD based on quadratic residues modulo 11. Let G be
the 12 x 24 matrix

$$
G = \left[\begin{array}{c|c|c|c}
1 & & 0 & \\
\vdots & I_{11} & \vdots & M \\
1 & & 0 & \\
\hline
0 & 0 \ \cdots \ 0 & 1 & 1 \ \cdots \ 1
\end{array}\right],
$$

where $I_{11}$ is the order 11 identity matrix.


The matrix G defines a linear transformation $G: F^{12} \to F^{24}$ with $x \mapsto xG$. The
(linear) code $\mathcal{G}_{24}$ is the image $\{xG : x \in F^{12}\}$. We call G a generating matrix for
$\mathcal{G}_{24}$. It is easy to use a computer to produce the codewords of $\mathcal{G}_{24}$ and thereby
verify that it is a $(24, 2^{12}, 8)$ code. But we can do so without a computer as
follows.


We leave it to the reader to use elementary row operations to reduce G to row
echelon form and thus show that G has rank 12. It follows that $\mathcal{G}_{24}$ has
parameters $(24, 2^{12}, d)$, where d is to be determined. We proceed to find the
weight distribution of $\mathcal{G}_{24}$.


Recall from Exercise 5.3 that if x and y are any two vectors of the same length,
then $w(x + y) = w(x) + w(y) - 2x \cdot y$.

If $r_i$ and $r_j$ are different rows of G, then $r_i \cdot r_j$ is even. This follows from the
row dot product property of M.

Now we show that if $x \in \mathcal{G}_{24}$, then w(x) is a multiple of 4. Any codeword is a
linear combination (over F) of the rows of G, so we can write
$x = r_1 + r_2 + \cdots + r_n$ (with relabeling of rows). We use induction on n. If n = 1,
then by inspection w(x) is a multiple of 4 (the last row has weight 12 and every
other row has weight 8). Now if $x = r_1 + r_2 + \cdots + r_n + r_{n+1}$, then

$$w(x) = w(r_1 + \cdots + r_n) + w(r_{n+1}) - 2(r_1 + \cdots + r_n) \cdot r_{n+1},$$

which, by remarks made earlier, is a multiple of 4. This completes the induction.

Therefore, the possible weights of words of $\mathcal{G}_{24}$ are 0, 4, 8, 12, 16, 20, 24.
Clearly, $0 \in \mathcal{G}_{24}$ and w(0) = 0. Also, $r_1 + r_2 + \cdots + r_{12} = (1, \ldots, 1)$, so there is a
word of weight 24. It follows that the binary complement of any codeword is
also a codeword and therefore the weight distribution of $\mathcal{G}_{24}$ is symmetric. This
distribution, as we have found so far, is given by Table 6.2. The variables $\alpha$, $\beta$, $\gamma$
have yet to be determined.


Table 6.2 The partially determined weight distribution of $\mathcal{G}_{24}$.

weight            0    4         8        12        16       20        24
number of words   1    $\alpha$  $\beta$  $\gamma$  $\beta$  $\alpha$  1

Suppose that $x \in \mathcal{G}_{24}$. Let L(x) be the left-hand string of length 12 of x and
R(x) the right-hand string. We can now represent x as x = [L(x), R(x)]. We claim
that $[R(x), L(x)] \in \mathcal{G}_{24}$; that is, $\mathcal{G}_{24}$ is invariant under the permutation of
coordinates

$$\tau = (1\ 13)(2\ 14)(3\ 15) \cdots (12\ 24).$$

We say that $\mathcal{G}_{24}$ is self-symmetric. Let v' denote the vector obtained from v by
switching the right and left halves. We leave it to the reader to show that for
each row r of G, the vector r' is a linear combination of the rows $r_i$. It follows
that $\mathcal{G}_{24}$ is invariant under $\tau$, as every codeword of $\mathcal{G}_{24}$ is a sum of rows $r_i$.

Next we observe that w(L(x)) is even whenever $x \in \mathcal{G}_{24}$, because the sum of k
of the first 11 rows (with or without the 12th row) yields w(L(x)) = k when k is
even and w(L(x)) = k + 1 when k is odd.

Because w(L(x)) + w(R(x)) is a multiple of 4, it follows that w(R(x)) is always
even.

Finally, we will show that $d \ge 8$, from which it follows that d = 8, because the
first row of G has weight 8. We need only show that no codeword has weight 4.
If x has weight 4, then (w(L(x)), w(R(x))) = (0, 4), (4, 0), or (2, 2). If w(L(x)) = 0
and x is nonzero, then x must be $r_{12}$, but then w(R(x)) = 12 > 4. Because $\mathcal{G}_{24}$ is
self-symmetric, the (4, 0) case is ruled out also. For the (2, 2) case, we can sum
any one or two of the first 11 rows of G and then add or not add the 12th row. In
each case the right half of the resulting codeword has weight 6, not 2.


Therefore, $\mathcal{G}_{24}$ is a $(24, 2^{12}, 8)$ code. Deleting any coordinate of $\mathcal{G}_{24}$
produces the Golay code $\mathcal{G}_{23}$, a code with parameters $(23, 2^{12}, 7)$.


Theorem. The Golay code $\mathcal{G}_{23}$ is a (perfect) $(23, 2^{12}, 7)$ code.


Let us complete the weight distribution table of $\mathcal{G}_{24}$. We know that $\mathcal{G}_{24}$ has
no codewords of weight 4 or 20. How many words have weight 8? If w(x) = 8,
then (w(L(x)), w(R(x))) equals (0, 8), (2, 6), (4, 4), (6, 2), or (8, 0). A glance at G
shows that (0, 8) is impossible, and hence by self-symmetry (8, 0) is impossible.
To obtain a weight partition (2, 6), we can add one or two rows of the first 11
rows of G and then add or not add the 12th row. There are
$2\left(\binom{11}{1} + \binom{11}{2}\right) = 132$ possibilities. Likewise, there are 132 ways to obtain a
codeword with weight partition (6, 2). To get a (4, 4) weight partition, we add
either three or four rows of G. The number of choices is
$\binom{11}{3} + \binom{11}{4} = 495$. Altogether, the number of words of weight 8 in $\mathcal{G}_{24}$ is
132 + 132 + 495 = 759.


We display the complete weight distribution of $\mathcal{G}_{24}$ in Table 6.3.


Table 6.3 The weight distribution of $\mathcal{G}_{24}$.

weight            0    8     12     16    24
number of words   1    759   2576   759   1
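
The weight distribution can also be confirmed by machine. The sketch below is one possible way to do so and rests on an assumption: M is taken to be the cyclic (11, 6, 3) design whose base block is 0 together with the quadratic residues 1, 3, 4, 5, 9 modulo 11, the first 11 rows of G are [1 | row of $I_{11}$ | 0 | row of M], and the 12th row is twelve 0's followed by twelve 1's. The names `BASE_BLOCK`, `generator_rows`, and `weight_distribution` are hypothetical.

```python
from collections import Counter

# Assumption: base block of the cyclic (11, 6, 3) design, namely 0 together
# with the quadratic residues modulo 11.
BASE_BLOCK = {0, 1, 3, 4, 5, 9}

def generator_rows():
    """Rows of a 12 x 24 generating matrix with the block layout above."""
    rows = []
    for i in range(11):
        left = [1] + [int(j == i) for j in range(11)]
        right = [0] + [int((j - i) % 11 in BASE_BLOCK) for j in range(11)]
        rows.append(left + right)
    rows.append([0] * 12 + [1] * 12)
    return rows

def weight_distribution(rows):
    """Enumerate all sums of rows over GF(2) and tally their weights."""
    masks = [sum(bit << pos for pos, bit in enumerate(row)) for row in rows]
    words = {0}
    for mask in masks:
        words |= {w ^ mask for w in words}   # doubles when mask is new
    return Counter(bin(w).count("1") for w in words)

dist = weight_distribution(generator_rows())
print(sorted(dist.items()))
# [(0, 1), (8, 759), (12, 2576), (16, 759), (24, 1)]
```

Codewords are stored as 24-bit integers, so adding a row over GF(2) is a single XOR.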

We are now ready to find the Steiner system S(5, 8, 24) sitting inside $\mathcal{G}_{24}$.
From the parameter theorem for S(5, 8, 24), we obtain $\lambda_5 = 1$, $\lambda_4 = 5$, $\lambda_3 = 21$,
$\lambda_2 = 77$, $\lambda_1 = 253$, and $\lambda_0 = 759$. As $\lambda_0$ is the number of blocks, it is clear that
the words of weight 8 in $\mathcal{G}_{24}$ should form the blocks of S(5, 8, 24). Let S be the set
of coordinates 1, ..., 24 and let C be the collection of sets of eight coordinates
which equal 1 in codewords of weight 8 in $\mathcal{G}_{24}$. We need to check that the t = 5
and $\lambda = 1$ conditions are met. This means that every five-element subset of S is
contained in exactly one block in C. Equivalently, every vector of weight 5 is
covered by exactly one codeword of weight 8 in $\mathcal{G}_{24}$.


A vector v of weight 5 cannot be covered by two different codewords x and y
of weight 8, or else $w(x + y) \le 6$, a contradiction.

Therefore, it remains to demonstrate that there are enough codewords of
weight 8 to satisfy the $\lambda = 1$ condition. There are $\binom{24}{5}$ vectors of weight 5, and
each of the 759 codewords of weight 8 covers $\binom{8}{5}$ of them. A simple calculation
shows that $\binom{24}{5} = 759\binom{8}{5}$. Therefore, S and C make up the desired Steiner system
S(5, 8, 24).


Theorem. The weight 8 codewords of $\mathcal{G}_{24}$ are an S(5, 8, 24) Steiner
system.
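
The covering property can be checked directly. The sketch below regenerates the codewords under the same assumption as before (M is the cyclic (11, 6, 3) design with base block {0, 1, 3, 4, 5, 9} modulo 11, an illustrative choice), collects the supports of the weight 8 codewords, and verifies that the five-element subsets they cover are pairwise distinct and number $\binom{24}{5}$. All names are hypothetical.

```python
from itertools import combinations
from math import comb

BASE_BLOCK = {0, 1, 3, 4, 5, 9}  # assumed base block modulo 11

def golay_codewords():
    """All 2**12 codewords, each stored as a 24-bit integer."""
    masks = []
    for i in range(11):
        bits = ([1] + [int(j == i) for j in range(11)] +
                [0] + [int((j - i) % 11 in BASE_BLOCK) for j in range(11)])
        masks.append(sum(b << pos for pos, b in enumerate(bits)))
    masks.append(sum(1 << pos for pos in range(12, 24)))
    words = {0}
    for mask in masks:
        words |= {w ^ mask for w in words}
    return words

def octads(words):
    """Supports (as frozensets of coordinates) of the weight 8 codewords."""
    return [frozenset(pos for pos in range(24) if (w >> pos) & 1)
            for w in words if bin(w).count("1") == 8]

blocks = octads(golay_codewords())
covered = [frozenset(s) for b in blocks for s in combinations(sorted(b), 5)]
print(len(blocks), len(covered), len(set(covered)), comb(24, 5))
# 759 42504 42504 42504
```

Since no five-element subset is repeated and the total count equals $\binom{24}{5}$, every five-element subset lies in exactly one block.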

EXERCISES

6.28 Show that the weight distribution of $\mathcal{G}_{23}$ is:

weight            0    7     8     11     12     15    16    23
number of words   1    253   506   1288   1288   506   253   1

6.29 Use the generating matrix G and a computer to find the codewords and
weight distribution of the $(24, 2^{12}, 8)$ code.

6.30 Find the parity check matrix of the $(24, 2^{12}, 8)$ code.

6.31 Show that a Steiner system S(5, 6, 12) can be constructed by fixing a
codeword x of weight 12 in $\mathcal{G}_{24}$ and taking all codewords which intersect x
in six places.

6.32 Show that $M_{12}$, the automorphism group of the above Steiner system,
has P(12, 5) = 12!/7! elements.


6.8 Lattices and sphere packings 


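An n-dimensional lattice $\mathcal{L}$ is a subset of $\mathbf{R}^n$ such that:

1. $0 = (0, \ldots, 0) \in \mathcal{L}$;

2. $x \in \mathcal{L}$ implies that $-x \in \mathcal{L}$;

3. $x, y \in \mathcal{L}$ imply that $x + y \in \mathcal{L}$.

We assume that $\mathcal{L}$ contains a point other than the origin (hence $\mathcal{L}$ is infinite) and
that $\mathcal{L}$ is discrete, which means that $\mathcal{L}$ has a finite intersection with any compact
subset of $\mathbf{R}^n$.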


The Euclidean distance between two points x and y in $\mathbf{R}^n$ is

(6.9) $d(x, y) = \left((x_1 - y_1)^2 + \cdots + (x_n - y_n)^2\right)^{1/2},$

and the norm of x is

(6.10) $\|x\| = d(x, 0) = (x_1^2 + \cdots + x_n^2)^{1/2}.$

If $\mathcal{L}$ is a discrete n-dimensional lattice containing a non-origin point x, then the
Euclidean sphere $\{y \in \mathbf{R}^n : 0 < \|y\| \le \|x\|\}$ contains only finitely many points of
$\mathcal{L}$. The minimum positive distance between 0 and a point in this intersection is
the minimum distance of $\mathcal{L}$, denoted $d(\mathcal{L})$. Two lattice points are neighbors if they
are separated by this minimum distance. Because a lattice is clearly translation
invariant, any base point determines the same minimum distance $d(\mathcal{L})$ to a
neighbor. If spheres of radius $\frac{1}{2}d(\mathcal{L})$ are centered at each lattice point, these
spheres touch only at their surfaces. Such a placement of spheres is called a
lattice sphere packing of $\mathbf{R}^n$.


It is desirable to know how densely spheres may be packed in $\mathbf{R}^n$ via a lattice
packing or otherwise. Related to this question is the computation of the
maximum number of spheres touching a given sphere in a lattice packing.
Specifically, let $c(\mathcal{L})$, the contact number of $\mathcal{L}$, be the number of spheres of
radius $\frac{1}{2}d(\mathcal{L})$ which touch the sphere centered at the origin. Equivalently, $c(\mathcal{L})$ is
the number of lattice points at distance $d(\mathcal{L})$ from 0.

Figure 6.6 shows part of a two-dimensional lattice with distance 2 and contact
number 4. This lattice is called $(2\mathbf{Z})^2$, as it consists of all points in the plane with
even integer coordinates. In general, the n-dimensional lattice $(2\mathbf{Z})^n$ consists of
all points in $\mathbf{R}^n$ with even integer coordinates.


Figure 6.6 A lattice packing with contact number 4 and density ≈ 0.79.


We can easily calculate the amount of the plane enclosed by the circles.
Consider a square of area 4 whose vertices are four lattice points (as indicated in
the figure). Such a square is called a fundamental region of the packing, as this
region repeats periodically to give the complete pattern of the packing. Because
the square contains four circular quadrants whose total area is $\pi$, we say that the
density of the packing is $\pi/4 \approx 0.79$. However, the densest possible packing in
$\mathbf{R}^2$ is not based on the lattice $(2\mathbf{Z})^2$, but on the triangular lattice T shown in
Figure 6.7. The fundamental region of T is an equilateral triangle of side length 2
and area $\sqrt{3}$. As this triangle contains three sixths of a unit circle, the density of
the packing is $\pi/(2\sqrt{3}) \approx 0.91$. The contact number, 6, is also greater for T than
for $(2\mathbf{Z})^2$. Because a unit sphere in the $(2\mathbf{Z})^n$ packing has two neighbors in each
of n directions, this lattice has contact number 2n. We shall see that the contact
number can be greatly improved in 24 dimensions.


Figure 6.7 A lattice packing with contact number 6 and density ≈ 0.91.


Open problem. Find, for each k, the highest possible contact number of a
lattice sphere packing in $\mathbf{R}^k$.


EXERCISES 


6.33 Show that the contact number of $\mathbf{Z}^n$ is 2n.

6.34 What is the maximum possible contact number $c(\mathcal{L})$ for a lattice
packing $\mathcal{L}$ in $\mathbf{R}^3$? What is the density of this packing?


6.9 Leech’s lattice 


In 1965 John Leech (1926-1992) proved that the extended Golay code $\mathcal{G}_{24}$ can
be used to construct a 24-dimensional lattice with a remarkably high contact
number. Leech's lattice $\mathcal{L}$ is defined as follows. For each codeword
$c = (c_1, \ldots, c_{24}) \in \mathcal{G}_{24}$ and each integer m, let $\mathcal{L}(c, m)$ be the set of integer
24-tuples $(x_1, \ldots, x_{24})$ for which

(A) $\sum_{i=1}^{24} x_i = 4m$ and

(B) $x_i \equiv m \pmod 4$ if $c_i = 0$, and $x_i \equiv m + 2 \pmod 4$ if $c_i = 1$.

Leech's lattice $\mathcal{L}$ is the union of all the $\mathcal{L}(c, m)$.

We need to verify that $\mathcal{L}$ is a lattice. Because the all-0 string is a codeword in
$\mathcal{G}_{24}$, it follows that $0 \in \mathcal{L}$ (with m = 0). If $x \in \mathcal{L}$, then $x \in \mathcal{L}(c, m)$ for some
$c \in \mathcal{G}_{24}$ and some integer m. It is easy to show that $-x \in \mathcal{L}(c, -m)$, and hence
$-x \in \mathcal{L}$. We will show that if $x, y \in \mathcal{L}$ with $x \in \mathcal{L}(c, m)$ and $y \in \mathcal{L}(d, n)$, then
$x + y \in \mathcal{L}(c + d, m + n)$. Clearly, $\sum (x_i + y_i) = \sum x_i + \sum y_i = 4m + 4n = 4(m + n)$, so
condition (A) is satisfied. As for condition (B), if $c_i$ and $d_i$ are of opposite parity,
then $c_i + d_i = 1$ and $x_i + y_i \equiv m + n + 2 \pmod 4$. If $c_i$ and $d_i$ have the same parity,
then $c_i + d_i = 0$ and $x_i + y_i \equiv m + n \pmod 4$. Therefore, condition (B) is satisfied
and $\mathcal{L}$ is a lattice.

So far, we have not used any particular property of the Golay code other than 
its linearity. 

We now calculate the distance $d(\mathcal{L})$ of Leech's lattice by finding the smallest
value of $\|x\| = (x_1^2 + \cdots + x_{24}^2)^{1/2}$ for a lattice point other than the origin. Condition
(B) implies that all the $x_i$ are even (if m is even) or all the $x_i$ are odd (if m is
odd). If all the $x_i$ are odd and some $x_i$ satisfies $|x_i| \ge 3$, then
$\sum x_i^2 \ge 23(1) + 3^2(1) = 32$. We shall soon see that 32 is, in fact, the minimum value
of $\|x\|^2$. Recall that the codewords of $\mathcal{G}_{24}$ have weights 0, 8, 12, 16, and 24. We say
that they have shapes $0^{24}$, $0^{16}1^{8}$, $0^{12}1^{12}$, $0^{8}1^{16}$, and $1^{24}$. If $|x_i| = 1$ for all $x_i$, then
x has shape $(+1)^a(-1)^b$, where (a, b) is one of (24, 0), (16, 8), (12, 12), (8, 16),
(0, 24). Therefore $\sum x_i = 24$, 8, 0, -8, or -24. In each case, the sum is 4m with m
even, a contradiction (since m is odd). Hence, the minimum value of $\|x\|$ for odd
$x_i$ is $\sqrt{32}$ and is achievable only by lattice points of shape $(\pm 1)^{23}(\pm 3)$.

Now suppose all the $x_i$ are even. If $|x_i| > 4$ for some $x_i$, then $\sum x_i^2 \ge 6^2 > 32$, so
we can disregard these vectors and assume $|x_i| = 0$, 2, or 4. If there is at least one
$x_i$ with $|x_i| = 2$, then there are at least eight (by examining the shapes of the
codewords). Therefore $\sum x_i^2 \ge 8 \cdot 2^2 = 32$. If always $|x_i| = 0$ or 4, then $|x_i| = 4$ for
at least two $x_i$, or else $\sum x_i = \pm 4 = 4(\pm 1)$, contradicting the fact that m is even. If
$|x_i| = 4$ for more than two $x_i$, then $\|x\|$ is too large. Thus, the minimum value of
$\|x\|$ for even $x_i$ is $\sqrt{32}$ and is achievable only by lattice points of shapes
$0^{16}(\pm 2)^{8}$ and $0^{22}(\pm 4)^{2}$.


Theorem. The distance of Leech's lattice is $\sqrt{32}$. The neighbors of the
origin have shapes $(\pm 1)^{23}(\pm 3)$, $0^{16}(\pm 2)^{8}$, and $0^{22}(\pm 4)^{2}$.


We now calculate the contact number $c(\mathcal{L})$ of Leech's lattice, noting first that
each lattice point comes from a unique choice of c and m. Consider first the
lattice points of shape $0^{16}(\pm 2)^{8}$. The codewords that generate these lattice points
have weights 8 or 16. Recalling Table 6.3, there are 759 codewords of weight 8
and 759 of weight 16. Since the lattice points of shape $0^{16}(\pm 2)^{8}$ satisfy the m even
condition, there are an even number of +2 and -2 components. Suppose that the
number of +2's is a and the number of -2's is b = 8 - a. Then the possibilities for
(a, b) are (8, 0), (2, 6), (4, 4), (6, 2), (0, 8). The first, third, and fifth of these come
from codewords of weight 8 and the others from codewords of weight 16. Thus,
there are $2^7 \cdot 759$ lattice points of shape $0^{16}(\pm 2)^{8}$.

How many lattice points have shape $0^{22}(\pm 4)^{2}$? Any choice of signs for the two
4's satisfies the m even condition. Therefore, because there are $\binom{24}{2}$ choices for
the placement of the 4's (lattice points of this shape come from the all-0
codeword and the all-1 codeword), there are $2^2\binom{24}{2}$ lattice points of shape
$0^{22}(\pm 4)^{2}$.

To find the number of lattice points of shape $(\pm 1)^{23}(\pm 3)$, let z be the number of
+1's in a lattice vector. Then $4m = z(1) - (23 - z) \pm 3 = 2z - 23 \pm 3$. This equation
forces the +3 to occur if z is even and the -3 if z is odd. Any of the $2^{12}$
codewords generates a choice of +1's and -1's. The position of the ±3 may be
chosen in 24 ways, and once it is chosen the sign is determined by the previous
comment. Thus, there are $2^{12} \cdot 24$ lattice points of shape $(\pm 1)^{23}(\pm 3)$.


Therefore $c(\mathcal{L}) = 2^7 \cdot 759 + 2^2\binom{24}{2} + 2^{12} \cdot 24 = 196{,}560$.
This is the highest possible contact number for a lattice packing in $\mathbf{R}^{24}$, as
proved by A. M. Odlyzko and N. J. A. Sloane in 1979.


Theorem. The contact number of Leech's lattice is 196560. Neighbors of
the origin occur with the following multiplicities:

shape                      number

$0^{16}(\pm 2)^{8}$        $2^7 \cdot 759 = 97152$

$0^{22}(\pm 4)^{2}$        $2^2\binom{24}{2} = 1104$

$(\pm 1)^{23}(\pm 3)$      $2^{12} \cdot 24 = 98304$.
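
The arithmetic behind these multiplicities is short enough to recompute. The sketch below only redoes the counting (it does not enumerate lattice points); the dictionary keys are informal labels for the three shapes and the variable names are hypothetical.

```python
from math import comb

# Sign patterns for a weight 8 or weight 16 codeword: the number of +2's
# among the eight nonzero entries must be even.
sign_patterns = sum(comb(8, a) for a in (0, 2, 4, 6, 8))

shape_counts = {
    "0^16 (+-2)^8": sign_patterns * 759,
    "0^22 (+-4)^2": 2**2 * comb(24, 2),
    "(+-1)^23 (+-3)": 2**12 * 24,
}

# Each shape has squared norm 32.
squared_norms = [8 * 2**2, 2 * 4**2, 23 * 1**2 + 3**2]

contact_number = sum(shape_counts.values())
print(sign_patterns, shape_counts, contact_number)
```

The count of sign patterns, 1 + 28 + 70 + 28 + 1 = 128, is where the factor $2^7$ comes from.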


We end the discussion with this respectable achievement, although the story of
combinatorial designs is far from finished. The interested reader can turn to [27]
and [8] for a description of the groups which J. H. Conway found in connection
with $\mathcal{L}$. Conway denoted the automorphism group of $\mathcal{L}$ by $\cdot 0$ (pronounced
“dotto”) and defined $\cdot 1$ to be $\cdot 0$ divided by its center. It turns out that $\cdot 1$, along
with two other automorphism groups, $\cdot 2$ and $\cdot 3$, are sporadic simple groups. The
order of $\cdot 1$ is

$$2^{21} \cdot 3^{9} \cdot 5^{4} \cdot 7^{2} \cdot 11 \cdot 13 \cdot 23.$$

Most simple groups belong to well-known families such as $A_n$, PSL(n, q), or the
groups of Lie type. However, 26 simple groups do not fall into these categories
and are therefore called sporadic simple groups. Several of the 26 sporadic
simple groups are associated with Aut $\mathcal{L}$, including the largest, the Monster
group, a group of symmetries in a space of 196,884 dimensions. The Monster
group has order

$$2^{46} \cdot 3^{20} \cdot 5^{9} \cdot 7^{6} \cdot 11^{2} \cdot 13^{3} \cdot 17 \cdot 19 \cdot 23 \cdot 29 \cdot 31 \cdot 41 \cdot 47 \cdot 59 \cdot 71.$$


EXERCISES 


6.35 What is the highest possible contact number in a lattice packing in $\mathbf{R}^4$?

6.36 Show how to use the (8, 16, 4) extended Hamming code to construct a
lattice packing in $\mathbf{R}^8$ with contact number 240.

The relationship between the (8, 16, 4) code and the lattice $E_8$, with highest
contact number (240) in $\mathbf{R}^8$, is discussed in [8].


Notes 


In 1987 Luc Teirlinck proved the existence of t-designs for all values of t. 
However, there are many open problems. See the survey “Simple t-Designs with 
large t,” by Donald L. Kreher, at http://www.math.mtu.edu/~kreher/. 

Fisher's inequality was proved by the statistician and biologist Ronald A.
Fisher (1890-1962). The Bruck-Chowla-Ryser theorem was proved in 1949-1950
by Richard H. Bruck (1914-1991), Sarvadaman D. S. Chowla (1907-1995),
and Herbert J. Ryser (1923-1985).

Philip Hall (1904-1982) published the marriage theorem in 1935. See [13] for
a description of some equivalent theorems, such as the König-Egerváry theorem
and Menger's theorem. The König-Egerváry theorem states that the minimum
number of rows and columns which cover the 1's in a 0-1 matrix equals the
maximum number of row- and column-independent 1's in the matrix. Menger's
theorem states that the minimum number of points which separate two given
nonadjacent points in a finite connected graph equals the maximum number of
paths connecting the two points that share no intermediate points.


The problem of finding the highest possible contact number for a lattice in $\mathbf{R}^k$
is very much unsolved. Conway and Sloane [8] call this the kissing number
problem and give a wealth of results. They also consider the related packing
problem, the covering problem (in which one wants the least dense covering),
and the so-called quantizing problem (which has application to data
compression). These subjects abound with open questions. Although the obvious
packing in $\mathbf{R}^3$ was proved by R. Hoppe in 1874 to have the highest possible
contact number (see [1]), it was not proved until recently to be the densest
possible packing. This result, known as Kepler's conjecture, was proved in 1998
by Thomas Hales at the University of Michigan. See
http://www.math.lsa.umich.edu/~hales/.

In $\mathbf{R}^4$, the lattice packing with highest contact number, 24, is the $D_4$ lattice.
We may take one sphere with center at the origin and the other spheres with
centers of the form (±1, ±1, 0, 0), where we can take any combination of signs
and the 0's can occur in any two of the four coordinates. This accounts for
$2^2 \cdot \binom{4}{2} = 24$ neighbors of the origin. These neighbors are the vertices of a
24-cell, one of the six 4-dimensional Platonic solids.
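
The count of 24 neighbors can be reproduced by listing the vectors explicitly. The sketch below is a small illustration; `d4_neighbors` is a hypothetical name.

```python
from itertools import combinations, product

def d4_neighbors():
    """All vectors with entries +1 or -1 in two coordinates and 0 elsewhere."""
    points = set()
    for i, j in combinations(range(4), 2):          # positions of the +-1's
        for si, sj in product((1, -1), repeat=2):   # choice of signs
            v = [0, 0, 0, 0]
            v[i], v[j] = si, sj
            points.add(tuple(v))
    return points

neighbors = d4_neighbors()
print(len(neighbors))                               # 24
print({sum(x * x for x in v) for v in neighbors})   # {2}
```

All 24 vectors have the same squared norm, 2, so they lie on a common sphere about the origin.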


In $\mathbf{R}^8$, the lattice packing with highest contact number is the $E_8$ lattice. Let C
be the (8, 16, 4) extended Hamming code. It has 14 codewords of weight 4.
Define the lattice to be the set of integer 8-tuples $(x_1, \ldots, x_8)$ such that
$x_i \equiv c_i \pmod 2$ for $1 \le i \le 8$ and some codeword $c = (c_1, \ldots, c_8) \in C$.
The neighbors of the origin have the form
(±2, 0, 0, 0, 0, 0, 0, 0), where the ±2 can be in any of the eight positions, or
(±1, ±1, ±1, ±1, 0, 0, 0, 0), where we can take any combination of signs and the four
0's correspond to the positions of 1's in the weight 4 codewords. There are
$2 \cdot 8 + 14 \cdot 2^4 = 240$ neighbors of the origin. The number of lattice points at distances
0, 2, 4, 6, ... from the origin are the coefficients of the remarkable theta series
formula

$$\Theta(q) = 1 + 240 \sum_{n=1}^{\infty} \sigma_3(n) q^{2n} = 1 + 240q^2 + 2160q^4 + 6720q^6 + \cdots,$$

where $\sigma_3(n)$ is the sum of the cubes of the positive divisors of n.
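The neighbor counts for $E_8$ can be checked by direct enumeration. The following Python sketch is only an illustration: the generator matrix `G` is one common presentation of the (8, 16, 4) extended Hamming code (an assumption, since the text does not fix one), and the function names are hypothetical. In this coordinate scaling the shells have squared lengths 4, 8, ..., so the coefficient of $q^{2n}$ is read off at squared length $4n$.

```python
from itertools import product

# One standard generator matrix of the (8, 16, 4) extended Hamming code (assumed).
G = [
    (1, 0, 0, 0, 0, 1, 1, 1),
    (0, 1, 0, 0, 1, 0, 1, 1),
    (0, 0, 1, 0, 1, 1, 0, 1),
    (0, 0, 0, 1, 1, 1, 1, 0),
]

# All 16 codewords: binary linear combinations of the rows of G.
codewords = {
    tuple(sum(a * g for a, g in zip(coeffs, col)) % 2 for col in zip(*G))
    for coeffs in product((0, 1), repeat=4)
}

def shell_sizes(max_norm=8):
    """Count lattice points (x mod 2 is a codeword) by squared length <= max_norm."""
    counts = {}
    # Coordinates of absolute value 3 or more already give squared length > 8.
    for x in product(range(-2, 3), repeat=8):
        norm = sum(t * t for t in x)
        if norm <= max_norm and tuple(t % 2 for t in x) in codewords:
            counts[norm] = counts.get(norm, 0) + 1
    return counts

def sigma3(n):
    """Sum of the cubes of the positive divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)

print(sorted(shell_sizes().items()))
print([240 * sigma3(n) for n in (1, 2)])
```

The two printed lists can be compared: the shell sizes at squared lengths 4 and 8 should agree with $240\sigma_3(1)$ and $240\sigma_3(2)$.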


In 1861 and 1873 Émile Mathieu (1835-1890) discovered five sporadic simple
groups, $M_{11}$, $M_{12}$, $M_{22}$, $M_{23}$, and $M_{24}$, which are related to Steiner systems.
The groups $M_{24}$ and $M_{23}$ are the automorphism groups of the S(5, 8, 24) and
S(4, 7, 23) Steiner systems, respectively. The group $M_{22}$ has index 2 in the
automorphism group of the S(3, 6, 22) Steiner system. The groups $M_{12}$ and $M_{11}$
are the automorphism groups of the S(5, 6, 12) and S(4, 5, 11) Steiner systems,
respectively. These groups may also be defined in terms of permutations. For
example, letting x = (1 2 3 4 5 6 7 8 9 10 11), y = (5 6 4 10)(11 8 3 7), and
z = (1 12)(2 11)(3 6)(4 8)(5 9)(7 10), we can write $M_{11} = \langle x, y \rangle$ and
$M_{12} = \langle x, y, z \rangle$. We have $|M_{11}| = 8 \cdot 9 \cdot 10 \cdot 11$ and
$|M_{12}| = 8 \cdot 9 \cdot 10 \cdot 11 \cdot 12$. With 7920 elements,
$M_{11}$ is the smallest of the 26 sporadic simple groups.
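The stated orders can be confirmed by generating the groups from the permutations x, y, z. The sketch below is a plain breadth-first closure, not an efficient group-theoretic algorithm; the helper names are hypothetical.

```python
def perm_from_cycles(cycles, n=12):
    """Permutation of {1, ..., n} as a tuple p with p[i - 1] = image of i."""
    p = list(range(1, n + 1))
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            p[a - 1] = b
    return tuple(p)

def generated_group(gens):
    """All permutations obtainable as products of the generators."""
    identity = tuple(range(1, len(gens[0]) + 1))
    seen = {identity}
    frontier = [identity]
    while frontier:
        new = []
        for p in frontier:
            for g in gens:
                q = tuple(g[i - 1] for i in p)  # apply p first, then g
                if q not in seen:
                    seen.add(q)
                    new.append(q)
        frontier = new
    return seen

x = perm_from_cycles([(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)])
y = perm_from_cycles([(5, 6, 4, 10), (11, 8, 3, 7)])
z = perm_from_cycles([(1, 12), (2, 11), (3, 6), (4, 8), (5, 9), (7, 10)])

print(len(generated_group([x, y])), 8 * 9 * 10 * 11)
print(len(generated_group([x, y, z])), 8 * 9 * 10 * 11 * 12)
```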
APPENDIX A


WEB RESOURCES 


Websites: 


• http://www.combinatorics.org/Surveys/
  “The Electronic Journal of Combinatorics: Dynamic Surveys,” a site
  containing up-to-date studies of combinatorics. The article Small Ramsey
  Numbers, by Stanislaw Radziszowski, shows the known values and
  bounds for Ramsey numbers.

• http://en.wikipedia.org/wiki/Combinatorics
  “Wikipedia: Combinatorics,” a site developed by users, containing
  material about various aspects of combinatorics and links to other
  sources.

• http://en.wikipedia.org/wiki/Discrete_mathematics
  “Wikipedia: Discrete Mathematics,” a site developed by users, containing
  material about various aspects of discrete mathematics and links to other
  sources.

• http://mathworld.wolfram.com/Combinatorics.html
  “MathWorld: Combinatorics,” a site containing material about
  various aspects of combinatorics and links to other sources.

• http://mathworld.wolfram.com/DiscreteMathematics.html
  “MathWorld: Discrete Mathematics,” a site containing material about
  various aspects of discrete mathematics and links to other sources.

• http://oeis.org/
  “The On-Line Encyclopedia of Integer Sequences,” a site containing
  integer sequences contributed by users. The sequences may be searched
  by initial terms, by name, or by attributes.


APPENDIX B 
NOTATION 


n!                                    n factorial, p. 4
$N$                                   {1, 2, 3, ...}, p. 6
$N_m$                                 {1, 2, 3, ..., m}, p. 6
$\binom{n}{k}$                        binomial coefficient, p. 8
$\binom{n}{k_1, k_2, \ldots, k_m}$    multinomial coefficient, p. 9
B(n)                                  Bell number, p. 22
p(n)                                  partition number, p. 22
p(n, k)                               partition number, p. 22
$d_n$                                 derangement number, p. 24
$F_n$                                 Fibonacci number, p. 31
$L_n$                                 Lucas number, p. 37
${n \brace k}$                        Stirling number of the second kind, p. 40
${n \brack k}$                        Stirling number of the first kind, p. 41
$d_p(n)$                              digital sum, p. 46
$C_n$                                 Catalan number, p. 64
$t_n$                                 number of transitive and reflexive relations, p. 69
$P_n$                                 number of partial orders, p. 69
$Z_n$                                 cyclic group, p. 86
$S_n$                                 symmetric group, p. 87
$D_n$                                 dihedral group, p. 88
$A_n$                                 alternating group, p. 88
g(n)                                  number of nonisomorphic graphs, p. 101
n(k, d)                               lattice point function, p. 112
n'(k, d)                              generalized SET® function, p. 112
$\delta(g)$                           degree of vertex, p. 115
$\overline{G}$                        complement graph, p. 115
$K_n$                                 complete graph, p. 115
$K_{m,n}$                             complete bipartite graph, p. 115


                                      infinite complete graph, p. 115
                                      infinite complete bipartite graph, p. 115
                                      cycle, p. 115
                                      path, p. 115
$\alpha(G)$                           independence number, p. 116
$\chi(G)$                             chromatic number, p. 116
R(m, n)                               Ramsey number, p. 133
$R(a_1, \ldots, a_c)$                 multiple color Ramsey number, p. 135
$R(a_1, \ldots, a_c; t)$              hypergraph Ramsey number, p. 136
                                      complete t-uniform hypergraph, p. 136
S(c)                                  Schur number, p. 145
W(c, l)                               van der Waerden number, p. 148
FC                                    Fano Configuration, p. 162
GL(n, q)                              general linear group, p. 164
SL(n, q)                              special linear group, p. 164
PGL(n, q)                             projective general linear group, p. 165
PSL(n, q)                             projective special linear group, p. 165
t-(v, k, λ)                           t-design, p. 171
S(t, k, v)                            Steiner system, p. 173
(v, b, r, k, λ) BIBD                  balanced incomplete block design, p. 175
(v, k, λ) SBD                         square block design, p. 177
                                      projective plane, p. 180
                                      affine plane, p. 182
                                      Latin square, p. 183
                                      mutually orthogonal Latin squares, p. 185
                                      ordered orthogonal design, p. 188
                                      Golay code, p. 194
                                      extended Golay code, p. 194
                                      Leech’s lattice, p. 200


EXERCISE SOLUTIONS 


SOLUTIONS FOR CHAPTER 1 


1.5 There are $2^{99}$ binary strings of length 99. Half of these, $2^{98}$, have an odd
number of 1's. The reason is that, regardless of the first 98 bits, the last bit
may be chosen to make the total number of 1’s even or odd.


1.7 There are n” such functions. 

1.17 There are $2^{n^2}$ different binary relations on an n-set.

1.20 Since the tenth row of Pascal’s triangle is 1, 10, 45, 120, 210, 252, 210,
120, 45, 10, 1, we have
$(a + b)^{10} = a^{10} + 10a^9b + 45a^8b^2 + 120a^7b^3 + 210a^6b^4 + 252a^5b^5 + 210a^4b^6 + 120a^3b^7 + 45a^2b^8 + 10ab^9 + b^{10}$.

1.21 By the binomial theorem, the coefficient is $\binom{20}{10}$ = 184,756.

1.25 By the multinomial theorem,
$(a + b + c)^4 = a^4 + 4a^3b + 6a^2b^2 + 4ab^3 + b^4 + 4a^3c + 12a^2bc + 12ab^2c + 4b^3c + 6a^2c^2 + 12abc^2 + 6b^2c^2 + 4ac^3 + 4bc^3 + c^4$.

1.26 By the multinomial theorem, the coefficient is $\binom{20}{3, 7, 10}$ = 22,170,720.
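1.27 By simplifying, we see that

$$\binom{n}{k_1, k_2, \ldots, k_m} = \binom{n}{k_1} \binom{n - k_1}{k_2} \binom{n - k_1 - k_2}{k_3} \cdots \binom{n - k_1 - k_2 - \cdots - k_{m-2}}{k_{m-1}}.$$

1.30 (a) The number of paths is $\binom{25}{10, 15}$ = 3,268,760.
(b) The number of paths is $\binom{45}{10, 15, 20}$ = 10,361,546,974,682,663,760.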
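The multinomial values in 1.26 and 1.30 are easy to recompute. In this sketch the part sizes (3, 7, 10), (10, 15), and (10, 15, 20) are inferred from the numerical answers rather than taken from the exercise statements, and both function names are hypothetical. The second function evaluates the product of binomial coefficients from 1.27.

```python
from math import comb, factorial

def multinomial(*parts):
    """Multinomial coefficient (k1 + ... + km)! / (k1! ... km!)."""
    result = factorial(sum(parts))
    for k in parts:
        result //= factorial(k)
    return result

def multinomial_as_binomials(*parts):
    """The same number as a product of binomial coefficients (identity of 1.27)."""
    remaining, result = sum(parts), 1
    for k in parts[:-1]:
        result *= comb(remaining, k)
        remaining -= k
    return result

for parts in [(3, 7, 10), (10, 15), (10, 15, 20)]:
    print(parts, multinomial(*parts), multinomial_as_binomials(*parts))
```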


1.32 The identity to prove is equivalent to

$$\sum_{k=2}^{n+1} \binom{k}{2} = \binom{n+2}{3}.$$

The right side is the number of selections of three distinct elements from the
set {1, ..., n + 2}. Let the largest such element be k + 1, where 2 ≤ k ≤ n + 1.
Then, for each k, the other two elements must be selected from the set
{1, ..., k}, and the number of ways to do this is $\binom{k}{2}$. The left side counts these
selections for all k.

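1.45 Clearly, $S_0(n) = n$. The formula for $S_1(n)$ can be obtained by noting
that $2S_1(n) = (n + 1)n$ and hence $S_1(n) = n(n + 1)/2$. To find $S_2(n)$ and
$S_3(n)$, we use the following technique. The sum $\sum_{i=1}^{n} [(i + 1)^{k+1} - i^{k+1}]$ is
a telescoping series. Hence

$$(n+1)^{k+1} - 1 = \sum_{i=1}^{n} [(i+1)^{k+1} - i^{k+1}] = \sum_{i=1}^{n} \sum_{j=0}^{k} \binom{k+1}{j} i^j = \sum_{j=0}^{k} \binom{k+1}{j} S_j(n).$$

Therefore

$$S_k(n) = \frac{1}{k+1} \left[ (n+1)^{k+1} - 1 - \sum_{j=0}^{k-1} \binom{k+1}{j} S_j(n) \right].$$

Now we find

$$S_2(n) = \frac{1}{3} \left[ (n+1)^3 - 1 - \binom{3}{0} n - \binom{3}{1} \frac{n(n+1)}{2} \right] = \frac{n(n+1)(2n+1)}{6}$$

and

$$S_3(n) = \frac{1}{4} \left[ (n+1)^4 - 1 - \binom{4}{0} n - \binom{4}{1} \frac{n(n+1)}{2} - \binom{4}{2} \frac{n(n+1)(2n+1)}{6} \right] = \left( \frac{n(n+1)}{2} \right)^2.$$

The fact that $S_k(n)$ is a polynomial in n of degree k + 1, with leading
coefficient 1/(k + 1), is clear from our method.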
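The recurrence for $S_k(n)$ translates directly into a short computation. This Python sketch (function name hypothetical) uses exact rational arithmetic and compares the result with the defining sum.

```python
from fractions import Fraction
from math import comb

def power_sum(k, n):
    """S_k(n) = 1^k + ... + n^k via the telescoping recurrence of 1.45."""
    s = []  # s[j] holds S_j(n)
    for j in range(k + 1):
        total = (n + 1) ** (j + 1) - 1 - sum(comb(j + 1, i) * s[i] for i in range(j))
        s.append(Fraction(total, j + 1))
    return s[k]

n = 10
for k in range(5):
    print(k, power_sum(k, n), sum(i ** k for i in range(1, n + 1)))
```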
1.46 A k-dimensional face of an n-dimensional hypercube is a subset of the
$2^n$ vertices of the cube isomorphic to a k-dimensional hypercube. There are
$\binom{n}{k}$ choices for the coordinates on which to base the face. The other
coordinates can be 0 or 1. Hence the number of k-dimensional faces is $\binom{n}{k} 2^{n-k}$.


1.47 The total number of pizzas is 


2+12-1 3+12-1 4+12—1 
veins 2PH2) 4 (HIN 419—1) san 


1.49 (a) The number of such functions is (" ). 
(b) The number of such functions is ("*"~"), 
1.50 We have ${5 \brace 1} = 1$, ${5 \brace 2} = 15$, ${5 \brace 3} = 25$, ${5 \brace 4} = 10$, ${5 \brace 5} = 1$, and B(5) = 52.

1.52 The formula ${n \brace 2} = 2^{n-1} - 1$ holds since any subset of {1, 2, 3, ..., n − 1},
except the set itself, together with n, constitutes a part in a partition of
{1, 2, 3, ..., n} into two subsets. The formula ${n \brace n-1} = \binom{n}{2}$ for n ≥ 2 holds since in a
partition of {1, 2, 3, ..., n} into n − 1 parts, exactly one part contains two
elements and there are $\binom{n}{2}$ choices for that pair.
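The values in 1.50 and the two formulas in 1.52 can be checked with the standard recurrence for Stirling numbers of the second kind. A minimal Python sketch, with a hypothetical function name:

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Number of partitions of an n-set into k nonempty parts."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    # Element n is either a singleton part or joins one of the k existing parts.
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)

row5 = [stirling2(5, k) for k in range(1, 6)]
print(row5, sum(row5))
```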


1.53 There are $10^{30}/4 + 1$ choices for z, namely, all the integers from 0 to
$10^{30}/4$. Among these choices, the average value of 4z is $10^{30}/2$. Hence, on
average, $x + 2y = 10^{30}/2$. There are $10^{30}/4 + 1$ values of y that satisfy this
equation, namely, the integers from 0 to $10^{30}/4$. Once z and y are chosen,
the value of x is determined. Therefore, the total number of nonnegative
integer solutions is $\left( \frac{10^{30}}{4} + 1 \right)^2$.
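The averaging argument can be tested on small totals. This sketch assumes the equation in the exercise is x + 2y + 4z = N (here N = $10^{30}$), which is inferred from the solution; it counts solutions by brute force and compares with $(N/4 + 1)^2$ when N is a multiple of 4.

```python
def count_solutions(total):
    """Count nonnegative integer solutions of x + 2y + 4z = total."""
    count = 0
    for z in range(total // 4 + 1):
        for y in range((total - 4 * z) // 2 + 1):
            count += 1  # x = total - 2y - 4z is then determined
    return count

for total in (4, 8, 40, 100):
    print(total, count_solutions(total), (total // 4 + 1) ** 2)
```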
1.56 (a) Let S be the set of functions from {1, 2, 3, ..., m} to {1, 2, 3, ..., n}.
Then S has cardinality $n^m$. We wish to find the cardinality of the subset of S
consisting of all onto functions. For 1 ≤ i ≤ n, let $A_i$ be the collection of
functions whose range does not contain i. Then the intersection of j of the $A_i$
has cardinality $(n - j)^m$, the number of unrestricted functions from {1, 2, 3,
..., m} to the n − j nonexcluded elements of {1, 2, 3, ..., n}. Applying the
inclusion-exclusion principle, we obtain

$$|A_1 \cup \cdots \cup A_n| = \sum_{j=1}^{n} (-1)^{j+1} \binom{n}{j} (n-j)^m.$$

The number of onto functions is the size of the complement of this union:

$$n^m - |A_1 \cup \cdots \cup A_n| = n^m - \sum_{j=1}^{n} (-1)^{j+1} \binom{n}{j} (n-j)^m = \sum_{j=0}^{n} (-1)^j \binom{n}{j} (n-j)^m.$$

(b) From (a), we obtain

$${n \brace k} = \frac{1}{k!} \sum_{j=0}^{k} (-1)^{k-j} \binom{k}{j} j^n, \quad 1 \le k \le n.$$

1.60 We take successive differences:

1, 4, 13, 34, 73, 136, ...
3, 9, 21, 39, 63, ...
6, 12, 18, 24, ...
6, 6, 6, ...

Having obtained a constant sequence, we stop. We find that the
polynomial is $p(n) = 1 + 3n + 6\binom{n}{2} + 6\binom{n}{3} = n^3 + 2n + 1$.
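The method of successive differences in 1.60 can be sketched as follows. The function names are hypothetical; the polynomial is assembled from the leading entries of the difference rows, as in Newton's forward difference formula.

```python
from math import comb

def difference_table(seq):
    """Rows of successive differences, stopping at a constant row."""
    rows = [list(seq)]
    while len(set(rows[-1])) > 1:
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    return rows

def newton_polynomial(seq):
    """Return p with p(n) = sum over i of (leading entry of row i) * C(n, i)."""
    leading = [row[0] for row in difference_table(seq)]
    return lambda n: sum(c * comb(n, i) for i, c in enumerate(leading))

rows = difference_table([1, 4, 13, 34, 73, 136])
p = newton_polynomial([1, 4, 13, 34, 73, 136])
for row in rows:
    print(row)
print([p(n) for n in range(8)])
```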
1.61 We can think of a pair of linked sets (A, B) as an onto function from
S to the set {A, B, A ∩ B} or to the set {A, B, A ∩ B, A ∪ B}. So the total
number of pairs of linked sets is T(n, 3) + T(n, 4).
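Evaluating T(n, 3) + T(n, 4) requires the number of onto functions. The sketch below assumes that T(m, n) denotes the number of onto functions from an m-set to an n-set, computed by the inclusion-exclusion formula of 1.56; the function names are hypothetical, and the brute-force version is only meant for small cases.

```python
from itertools import product
from math import comb

def onto_count(m, n):
    """Onto functions from an m-set to an n-set, by inclusion-exclusion."""
    return sum((-1) ** j * comb(n, j) * (n - j) ** m for j in range(n + 1))

def onto_count_brute(m, n):
    """The same count by listing all n^m functions (small cases only)."""
    return sum(1 for f in product(range(n), repeat=m) if len(set(f)) == n)

for size in range(3, 8):
    print(size, onto_count(size, 3) + onto_count(size, 4))
```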
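1.65 The Fibonacci numbers are sums of “shallow triangles” of Pascal’s
triangle:

$$\binom{n}{0} + \binom{n-1}{1} + \binom{n-2}{2} + \cdots = F_{n+1}.$$

We prove the identity by induction. For n = 0, we have $\binom{0}{0} = 1 = F_1$, and
for n = 1, we have $\binom{1}{0} = 1 = F_2$.
Assume that the identity holds for n and n − 1. Then

$$\binom{n+1}{0} + \binom{n}{1} + \binom{n-1}{2} + \cdots = \binom{n}{0} + \left[ \binom{n-1}{0} + \binom{n-1}{1} \right] + \left[ \binom{n-2}{1} + \binom{n-2}{2} \right] + \cdots$$
$$= \left[ \binom{n}{0} + \binom{n-1}{1} + \binom{n-2}{2} + \cdots \right] + \left[ \binom{n-1}{0} + \binom{n-2}{1} + \cdots \right] = F_{n+1} + F_n = F_{n+2}.$$

Hence, the identity holds for n + 1 and by induction for all n.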
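A quick numerical check of the diagonal-sum identity, as a Python sketch with hypothetical function names and the convention $F_0 = 0$, $F_1 = 1$:

```python
from math import comb

def fib(n):
    """Fibonacci number F_n with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def shallow_diagonal(n):
    """C(n, 0) + C(n - 1, 1) + C(n - 2, 2) + ... (the nonzero terms only)."""
    return sum(comb(n - k, k) for k in range(n // 2 + 1))

print([shallow_diagonal(n) for n in range(10)])
print([fib(n + 1) for n in range(10)])
```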
1.68 The number 210 occurs six times in Pascal’s triangle:

$$210 = \binom{210}{1} = \binom{210}{209} = \binom{21}{2} = \binom{21}{19} = \binom{10}{4} = \binom{10}{6}.$$
1.74 A particular solution to the recurrence relation is $a_n = -n - 3$. Hence,
the general solution is of the form $a_n = A\phi^n + B\hat{\phi}^n - n - 3$.
We need to choose A and B so that the initial values are satisfied. Thus

$$0 = A + B - 3$$
$$1 = A\phi + B\hat{\phi} - 4.$$

We find that $A = (15 + 7\sqrt{5})/10$ and $B = (15 - 7\sqrt{5})/10$.
1.86 We have ${n \brack n-1} = \binom{n}{2}$, as this counts the ways of choosing two
elements to be in the same cycle.
1.88 In a partition of n + 1 elements, the (n + 1)st element is together with
n − k other elements for some k with 0 ≤ k ≤ n. There are $\binom{n}{n-k}$ choices for
these n − k elements. Hence

$$B(n+1) = \sum_{k=0}^{n} \binom{n}{n-k} B(k) = \sum_{k=0}^{n} \binom{n}{k} B(k), \quad n \ge 0.$$

1.89 From the recurrence relation for Stirling numbers of the second kind,

$$\sum_{k=1}^{n} k {n \brace k} = \sum_{k=1}^{n} \left( {n+1 \brace k} - {n \brace k-1} \right) = B(n+1) - 1 - (B(n) - 1).$$

Therefore, the expected number of parts is

$$\frac{1}{B(n)} \sum_{k=1}^{n} k {n \brace k} = \frac{B(n+1) - B(n)}{B(n)}.$$
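Both Bell number facts can be checked numerically. The following sketch (function names hypothetical) builds Bell numbers from the recurrence of 1.88 and compares the weighted Stirling sum of 1.89 with B(n + 1) − B(n).

```python
from math import comb

def bell_numbers(limit):
    """B(0), ..., B(limit) from B(n + 1) = sum over k of C(n, k) B(k)."""
    bell = [1]
    for n in range(limit):
        bell.append(sum(comb(n, k) * bell[k] for k in range(n + 1)))
    return bell

def stirling2_row(n):
    """[S(n, 0), ..., S(n, n)] via S(n, k) = S(n - 1, k - 1) + k S(n - 1, k)."""
    row = [1]
    for _ in range(n):
        row = [0] + [row[k - 1] + k * row[k] for k in range(1, len(row))] + [1]
    return row

B = bell_numbers(12)
for n in range(1, 8):
    weighted = sum(k * s for k, s in enumerate(stirling2_row(n)))
    print(n, weighted, B[n + 1] - B[n], weighted / B[n])
```

The last column is the expected number of parts of a random partition of an n-set.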
1.90 Upon the substitutions k → −n + 1 and n → −k + 1, the recurrence
relation (1.32) becomes the recurrence relation (1.33).
1.91 The relation follows by subtracting 1 from each part in the partition 
of n into k parts. 
1.107 Obviously, $F_m \mid F_m$. Applying the identity

$$F_{m+n} = F_m F_{n+1} + F_{m-1} F_n, \quad m \ge 1, \ n \ge 0$$

(Exercise 1.64 with n = m), we obtain $F_{2m} = F_{m+1} F_m + F_m F_{m-1}$,
and hence $F_m \mid F_{2m}$. Similarly, with n = 2m, we obtain
$F_{3m} = F_{m+2m} = F_{m+1} F_{2m} + F_m F_{2m-1}$,
and hence $F_m \mid F_{3m}$. Continuing in this manner, we find that $F_m \mid F_{km}$ for
all k ≥ 1.
Now suppose that n = mk + r, with 0 ≤ r < m. Then, applying the
identity again, we obtain $F_n = F_{mk+r} = F_{mk+1} F_r + F_{mk} F_{r-1}$.
If $F_m \mid F_n$, then, since $F_m \mid F_{mk}$, it follows that $F_m \mid F_{mk+1} F_r$. But
consecutive Fibonacci numbers are coprime, and so $F_m$ and $F_{mk+1}$
are coprime. Hence $F_m \mid F_r$, but this is possible only if r = 0 (since $F_m > F_r$).
Therefore m | n.

1.108 Since gcd(a, b) | a, it follows from the previous exercise that
$F_{\gcd(a,b)} \mid F_a$. Likewise, $F_{\gcd(a,b)} \mid F_b$. Hence $F_{\gcd(a,b)} \mid \gcd(F_a, F_b)$.

There exist integers x and y such that gcd(a, b) = ax + by. Without
loss of generality, assume that x is negative and y is positive (they
cannot both be positive). Hence gcd(a, b) + a(−x) = by.

By the identity of the previous exercise,

$$F_{by} = F_{\gcd(a,b) + a(-x)} = F_{\gcd(a,b)+1} F_{a(-x)} + F_{\gcd(a,b)} F_{a(-x)-1}.$$

Since $F_b \mid F_{by}$ and $F_a \mid F_{a(-x)}$, the product $F_{\gcd(a,b)} F_{a(-x)-1}$ is a linear
combination of $F_a$ and $F_b$ and hence divisible by $\gcd(F_a, F_b)$. Since $F_a$ divides
$F_{a(-x)}$, which is coprime to $F_{a(-x)-1}$, the numbers $F_a$ and $F_{a(-x)-1}$ are
relatively prime, and it follows that $\gcd(F_a, F_b) \mid F_{\gcd(a,b)}$.
Therefore $F_{\gcd(a,b)} = \gcd(F_a, F_b)$.
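The gcd identity, together with the addition formula used in both proofs, can be spot-checked numerically. A short Python sketch (function name hypothetical, with $F_0 = 0$ and $F_1 = 1$):

```python
from math import gcd

def fib(n):
    """Fibonacci number F_n with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# Addition formula F_{m+n} = F_m F_{n+1} + F_{m-1} F_n for m >= 1, n >= 0.
addition_ok = all(fib(m + n) == fib(m) * fib(n + 1) + fib(m - 1) * fib(n)
                  for m in range(1, 25) for n in range(25))

# gcd(F_a, F_b) = F_{gcd(a, b)}.
gcd_ok = all(gcd(fib(a), fib(b)) == fib(gcd(a, b))
             for a in range(1, 40) for b in range(1, 40))

print(addition_ok, gcd_ok)
```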


SOLUTIONS FOR CHAPTER 2 
2.1 The sum is 


(f'(@))le=y = 
2.4 The generating function is $x\bigl(x((1 - x)^{-1})'\bigr)' = x(x + 1)(1 - x)^{-3}$.
2.12 A recurrence formula is $a_0 = 1$, $a_1 = -3$, and $a_n = -3a_{n-1} - a_{n-2}$
for n ≥ 2. Notice that $a_n = (-1)^n F_{2n+2}$.
2.14 The identity follows upon summing f(x), with x replaced
by $\omega^j x$ and weighted by $\omega^{-jp}$, for 0 ≤ j ≤ q − 1, using the
relation $1 + \omega + \omega^2 + \cdots + \omega^{q-1} = 0$.
2.17 We can find a simple formula for t(n) where n has any
given remainder upon division by 12. Thus

t(12k) = 3k² 
t(12k + 1) = 3k² + 2k
t(12k + 2) = 3k² + k
t(12k + 3) = 3k² + 3k + 1
t(12k + 4) = 3k² + 2k
t(12k + 5) = 3k² + 4k + 1
t(12k + 6) = 3k² + 3k + 1
t(12k + 7) = 3k² + 5k + 2
t(12k + 8) = 3k² + 4k + 1
t(12k + 9) = 3k² + 6k + 3
t(12k + 10) = 3k² + 5k + 2
t(12k + 11) = 3k² + 7k + 4.

Since these formulas are all quadratic polynomials in k, t(n) satisfies
the recurrence relation
t(12k + r) = 3t(12(k − 1) + r) − 3t(12(k − 2) + r) + t(12(k − 3) + r).
The desired recurrence relation follows immediately.
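The twelve formulas can be compared with a direct count. This sketch assumes that t(n) is the number of incongruent integer-sided triangles with perimeter n, which is how the nearest-integer formula in the next solution is usually interpreted; the names `triangles` and `FORMULAS` are hypothetical.

```python
def triangles(n):
    """Count integer triples a <= b <= c with a + b + c = n and a + b > c."""
    count = 0
    for a in range(1, n // 3 + 1):
        for b in range(a, (n - a) // 2 + 1):
            c = n - a - b
            if a + b > c:
                count += 1
    return count

# (linear coefficient, constant) in t(12k + r) = 3k^2 + (linear)k + constant.
FORMULAS = {0: (0, 0), 1: (2, 0), 2: (1, 0), 3: (3, 1), 4: (2, 0), 5: (4, 1),
            6: (3, 1), 7: (5, 2), 8: (4, 1), 9: (6, 3), 10: (5, 2), 11: (7, 4)}

def t_formula(n):
    k, r = divmod(n, 12)
    lin, const = FORMULAS[r]
    return 3 * k * k + lin * k + const

print([triangles(n) for n in range(1, 25)])
print([t_formula(n) for n in range(1, 25)])
```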
2.19 First, we prove that the period of {t(n) mod m} is at most
12m. If n is even, we may write n = 12m + 2r and we have
$t(12m + 2r) = \{(12m + 2r)^2/48\} = 3m^2 + mr + \{(2r)^2/48\} \equiv t(2r) \pmod m$.
Here, {x} denotes the nearest integer to x. The odd case is
similar; it follows that t(12m + u) ≡ t(u) (mod m) for all u. Hence,
the period of {t(n) mod m} is a divisor of 12m (and therefore at
most 12m).

Second, we show that the period of {t(n) mod m} is at least
12m.

Suppose that the period is p, and let k = p or p − 1 so that k
is even. We then have $\{k^2/48\} \equiv 0 \pmod m$ and
$\{(k + 2)^2/48\} \equiv 0 \pmod m$ since t(−1) = t(0) = t(1) = t(2) = 0.
Thus m divides $\{(k + 2)^2/48\} - \{k^2/48\}$, which is nonzero
because p > 12. This difference is less than $(k + 2)^2/48 - k^2/48 + 1$,
which implies that 12m < k + 13 ≤ p + 13, thus
completing the proof since 12m is a multiple of p and p > 12.
2.21 We will prove the result by induction on n. The claim is
true for n = 0 since $C_0 = 1$ and $0 = 2^0 - 1$. Assume that the
claim holds for all Catalan numbers up to $C_n$. Now consider
$C_{n+1}$. If n + 1 is even, then

$$C_{n+1} = 2 \sum_{k=0}^{(n-1)/2} C_k C_{n-k},$$

which is even. If n + 1 is odd, then

$$C_{n+1} = 2 \sum_{k=0}^{n/2 - 1} C_k C_{n-k} + C_{n/2}^2.$$

So $C_{n+1}$ is odd if and only if $C_{n/2}$ is odd. By the induction
hypothesis, this means that $n/2 = 2^k - 1$, for some integer k.
Thus, $C_{n+1}$ is odd if and only if

$$n + 1 = 2(2^k - 1) + 1 = 2^{k+1} - 1.$$

By mathematical induction, the claim holds for all
nonnegative integers n.

2.22 From the recurrence relation $C_n = \frac{4n-2}{n+1} C_{n-1}$, we have
$(n + 1)C_n \equiv (n + 1)C_{n-1} \pmod 3$.

If $n \not\equiv 2 \pmod 3$, then $C_n \equiv C_{n-1} \pmod 3$. Letting n = 3k,
we have $C_{3k} \equiv C_{3k-1} \pmod 3$. Letting n = 3k + 1, we have
$C_{3k+1} \equiv C_{3k} \pmod 3$. Therefore
$C_{3k-1} \equiv C_{3k} \equiv C_{3k+1} \pmod 3$.
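Both Catalan number results lend themselves to a numerical check. The sketch below uses the closed form $C_n = \binom{2n}{n}/(n+1)$; the function name is hypothetical.

```python
from math import comb

def catalan(n):
    """Catalan number C_n = C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

# 2.21: C_n is odd exactly when n + 1 is a power of 2.
odd_indices = [n for n in range(70) if catalan(n) % 2 == 1]
print(odd_indices)

# 2.22: C_{3k-1}, C_{3k}, C_{3k+1} agree modulo 3.
print([(catalan(3 * k - 1) % 3, catalan(3 * k) % 3, catalan(3 * k + 1) % 3)
       for k in range(1, 8)])
```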

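2.27 Clearly, the formula holds for n = 0 and n = 1. Assume
that it holds for n. Then

$$\begin{aligned}
(x + y)^{(n+1)} &= (x + y)^{(n)} (x + y + n) \\
&= \sum_{k=0}^{n} \binom{n}{k} x^{(k)} y^{(n-k)} \left[ (x + k) + (y + (n - k)) \right] \\
&= \sum_{k=0}^{n} \binom{n}{k} x^{(k+1)} y^{(n-k)} + \sum_{k=0}^{n} \binom{n}{k} x^{(k)} y^{(n-k+1)} \\
&= \sum_{k=1}^{n+1} \binom{n}{k-1} x^{(k)} y^{(n-k+1)} + \sum_{k=0}^{n} \binom{n}{k} x^{(k)} y^{(n-k+1)} \\
&= \sum_{k=1}^{n} \left[ \binom{n}{k-1} + \binom{n}{k} \right] x^{(k)} y^{(n+1-k)} + x^{(n+1)} y^{(0)} + x^{(0)} y^{(n+1)} \\
&= \sum_{k=0}^{n+1} \binom{n+1}{k} x^{(k)} y^{(n+1-k)}.
\end{aligned}$$

Thus, the formula holds for n + 1 and hence for all n by
induction.

The proof of the formula for $(x + y)_{(n)}$ is similar.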
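The identity can be tested on integer values. This sketch assumes that $x^{(n)}$ denotes the rising factorial x(x + 1)···(x + n − 1), which is consistent with the step $(x + y)^{(n+1)} = (x + y)^{(n)}(x + y + n)$ used in the induction; the function names are hypothetical.

```python
from math import comb

def rising(x, n):
    """Rising factorial x (x + 1) ... (x + n - 1); equals 1 when n = 0."""
    result = 1
    for i in range(n):
        result *= x + i
    return result

def rising_binomial_rhs(x, y, n):
    """Sum over k of C(n, k) x^(k) y^(n - k)."""
    return sum(comb(n, k) * rising(x, k) * rising(y, n - k) for k in range(n + 1))

print(rising(2 + 3, 4), rising_binomial_rhs(2, 3, 4))
```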


2.31 There are 48,639 such walks. It is easy to generate an 8
× 8 table by starting with 1 at a1 and at each cell adding the
left, lower, and lower-left neighbors. This produces the 
Delannoy numbers. The main diagonal Delannoy numbers 
are 1, 3, 13, 63, 321, 1683, 8989, 48639. 
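The table described above can be generated in a few lines. In this Python sketch (function name hypothetical) the cells of the first row and first column are set to 1, since each is reachable by a single straight path, and every other cell is the sum of its left, lower, and lower-left neighbors.

```python
def delannoy_table(size=8):
    """size x size table of Delannoy numbers D(i, j)."""
    table = [[0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            if i == 0 or j == 0:
                table[i][j] = 1
            else:
                table[i][j] = table[i - 1][j] + table[i][j - 1] + table[i - 1][j - 1]
    return table

table = delannoy_table()
diagonal = [table[i][i] for i in range(8)]
print(diagonal)
```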


2.32 A recurrence relation is

$$q(m, n) = 2q(m-1, n) + 2q(m, n-1) - q(m-1, n-1) - 3q(m-2, n-1) - 3q(m-1, n-2) + 4q(m-2, n-2),$$

for m ≥ 3 or n ≥ 3.

The initial values are

q(0, 0) = 1; q(0, 1) = 1; q(0, 2) = 2
q(1, 0) = 1; q(1, 1) = 3; q(1, 2) = 7
q(2, 0) = 2; q(2, 1) = 7; q(2, 2) = 22.

The corresponding generating function is

$$\frac{1 - x - y + x^2 y + x y^2 - x^2 y^2}{1 - 2x - 2y + xy + 3x^2 y + 3x y^2 - 4x^2 y^2}.$$
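The recurrence can be compared with a direct count. The sketch below assumes that q(m, n) counts lattice paths from (0, 0) to (m, n) whose steps are (i, 0), (0, i), or (i, i) with i ≥ 1 (queen moves); this interpretation is inferred from the initial values and is not stated in the solution. Values with a negative argument are taken to be 0, and the function names are hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def q(m, n):
    """Paths from (0, 0) to (m, n) with steps (i, 0), (0, i), (i, i), i >= 1."""
    if m < 0 or n < 0:
        return 0
    if m == 0 and n == 0:
        return 1
    total = sum(q(m - i, n) for i in range(1, m + 1))
    total += sum(q(m, n - i) for i in range(1, n + 1))
    total += sum(q(m - i, n - i) for i in range(1, min(m, n) + 1))
    return total

def recurrence_rhs(m, n):
    """Right side of the recurrence relation in 2.32."""
    return (2 * q(m - 1, n) + 2 * q(m, n - 1) - q(m - 1, n - 1)
            - 3 * q(m - 2, n - 1) - 3 * q(m - 1, n - 2) + 4 * q(m - 2, n - 2))

print([[q(m, n) for n in range(3)] for m in range(3)])
print(all(q(m, n) == recurrence_rhs(m, n)
          for m in range(9) for n in range(9) if m >= 3 or n >= 3))
```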
2.43 For x ∈ X, let $N_x$ (neighborhood) be the intersection of
all sets in S that contain x. Since S is closed under unions
and complements, it is closed under intersections. Hence $N_x \in S$.
The sets $N_x$ partition X. Therefore, the number of
algebras is equal to the number of partitions of X. (a) If X is
labeled, then the number is B(n). (b) If X is unlabeled, then
the number is p(n).


2.44 The number of functions is $n^n$. Given any element x ∈
{1, ..., n}, there are $(n - 1)^n$ functions that do not include x in
their range. Altogether, there are $n(n - 1)^n$ of these elements x
(with multiplicity). Hence

$$r(n) = n - n\left(1 - \frac{1}{n}\right)^n,$$

and so

$$\lim_{n \to \infty} \frac{r(n)}{n} = 1 - \frac{1}{e}.$$

2.52 There are 3210 such necklaces. 

2.56 From the generating function in Example 2.22, we see 
that there are 16 circular necklaces with five white beads 
and five black beads. 

2.63 There are 34 nonisomorphic graphs of order 5. 

2.69 (a) We will show that the coefficients of $x_1^{a_1} \cdots x_m^{a_m}$ in
both expressions are equal. Let $k = 1a_1 + \cdots + m a_m$ and
$a_{m+1} = \cdots = a_k = 0$. In the expression on the left, the
contribution to $x_1^{a_1} \cdots x_m^{a_m}$ comes from the n = k term,
which is

$$\frac{1}{k!} \cdot \frac{k!}{\prod_{j=1}^{k} j^{a_j} a_j!} = \frac{1}{\prod_{j=1}^{k} j^{a_j} a_j!}.$$

In the expression on the right, the factor $x_j^{a_j}$ comes from the
$x_j^{a_j}/(j^{a_j} a_j!)$ term in the jth factor of
$\exp\left(\sum_{j \ge 1} x_j/j\right) = \prod_{j \ge 1} \exp(x_j/j)$, for j = 1, ..., m. Hence,
the overall contribution is

$$\frac{1}{\prod_{j=1}^{m} j^{a_j} a_j!}.$$

2.70 There are two such graphs of order 5 and none of order
6. A self-complementary graph of order n > 1 has n(n − 1)/4
edges. For this to be an integer, n must be of the form 4k or
4k + 1.

2.74 It is approximately $1.3 \times 10^{1332}$.


SOLUTIONS FOR CHAPTER 3 
3.1 By the pigeonhole principle, there exist m and n,
with m < n, such that $17^m \equiv 17^n \pmod{10^4}$. Since
$\gcd(17, 10^4) = 1$, we have $17^{n-m} \equiv 1 \pmod{10^4}$.
We can actually find such an exponent using
Euler’s theorem: $a^{\phi(m)} \equiv 1 \pmod m$ if gcd(a, m) = 1.
The furnished exponent is $\phi(10^4) = 4000$.
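The exponent supplied by Euler's theorem is easy to confirm with modular exponentiation. In this sketch `euler_phi` is a deliberately naive, hypothetical helper; the last line searches for the smallest exponent that works, which must divide 4000.

```python
from math import gcd

def euler_phi(m):
    """Euler's totient function by direct count (fine for small m)."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)

modulus = 10 ** 4
phi = euler_phi(modulus)
print(phi, pow(17, phi, modulus))

# Smallest positive exponent e with 17^e congruent to 1 modulo 10^4.
order = next(e for e in range(1, phi + 1) if pow(17, e, modulus) == 1)
print(order, phi % order)
```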
3.3 To show that this result is the best possible, let S =
{1, ..., 100} and take $A_1$ to be any subset of S with 67
elements. Let the other $A_i$ be cyclic shifts of $A_1$. In this
system of sets, each x ∈ S is contained in exactly 67 of
the $A_i$.
3.23 Let $G_1$ consist of three disjoint copies of $K_4$ and
$G_2$ be $G_1$ with one edge added.
3.24 If α(G) and χ(G) were both finite, then the number
of vertices in G would be bounded by their product (a
finite number). Since the number of vertices is infinite,
at least one of α(G) and χ(G) is infinite.
3.25 Given any two vertices x and y of G, the degree 
conditions imply that x and y have a common neighbor 


z, and hence there is a path from x to y. Therefore G is 
connected. 

3.26 Suppose that the longest path contained in G had 
fewer than n vertices. Then, choosing any two 
consecutive vertices in the path, say, x and y, the degree 
condition implies that x and y have a common neighbor 
z. Using z, we can make a longer path. Hence, the 
longest path in G has n vertices. The same type of 
argument allows us to close the path to make a cycle of 
length n. 

3.27 The result follows by mathematical induction. 
Removing an end vertex and its incident edge from the 
graph amounts to subtracting 1 from both sides of the 
relation p = q+1. Continue this process until the graph 
contains only one vertex; the relation certainly holds for 
this graph. 

3.33 Take n disjoint copies of a total order on m 
elements. 

SOLUTIONS FOR CHAPTER 4 

4.2 We can take f(n) = R(n, n). Number the vertices 1, 2,
..., f(n), and color the edge {i, j}, where i < j, green if i is
directed to j, and red otherwise. Ramsey’s theorem
guarantees a monochromatic subgraph $K_n$. The
corresponding directed subgraph is a transitive
subtournament.

4.10 Take $G = K_{\infty, \infty, \infty}$. Call the parts of the tripartite
graph A, B, and C. Color all edges between A and B
green and all other edges red.

4.11 Let the graph be the cycle $C_8$ together with all four
diagonals.

4.13 Given a 3-coloring of $K_{17}$, let v be any vertex.
There are 16 edges incident with v. By the pigeonhole
principle, at least six of these edges are the same color,
say, green. Suppose that v is incident to vertices
$x_1, x_2, x_3, x_4, x_5, x_6$ by green edges. If any edge $\{x_i, x_j\}$ is
green, then we have a green $K_3$. If not, then we have a
red $K_3$ or a blue $K_3$ (since R(3, 3) = 6). To finish the
argument, show a 3-coloring of $K_{16}$ with no
monochromatic triangle.

4.15 From the probabilistic method, we obtain the lower
bound $3.0038 \times 10^{16}$.

4.23 False. Take one part to be the union of {1}, {3, 4}, 
{7, 8, 9}, {13, 14, 15, 16}, etc. 

4.26 An example of a 6-AP of primes is 7, 37, 67, 97,
127, 157.


SOLUTIONS FOR CHAPTER 5 
5.1 The contribution from each component to both sides
of the identity is 0 if $x_i = y_i$ and 1 if $x_i \ne y_i$.
5.4 0000, 0011, 0101, 0110, 1001, 1010, 1100, 1111 
The code detects one error. 
5.6 Such a code would require that 


10 10 
9. 1 < 910 


a contradiction. 
5.11 The probability that the Hamming code corrects
one error is $(1 - q)^7 + 7(1 - q)^6 q \approx 0.96$.
If no code is used, then the probability that the message
is transmitted correctly is $(1 - q)^4 \approx 0.81$.
5.12 Append an eighth component to the Hamming 
code such that the entry makes the total number of 1’s 
even. This ensures that the distance of the code is 4. 
5.25 The group is GL(3, 2). 


SOLUTIONS FOR CHAPTER 6 
6.2 A projective plane of order 4 is a 2-(21, 5, 1)
design.
6.7 Let the first row of the matrix be [1 1 0 1 0 0 0]
and take cyclic shifts.
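The cyclic construction in 6.7 can be verified mechanically. This sketch assumes the exercise asks for an incidence matrix of a 2-(7, 3, 1) design (the Fano Configuration), which is not stated in the solution itself; it checks that every block has three points and that every pair of points lies in exactly one block.

```python
from itertools import combinations

first_row = [1, 1, 0, 1, 0, 0, 0]
# The seven cyclic shifts of the first row form the rows of the matrix.
rows = [first_row[-s:] + first_row[:-s] for s in range(7)]
blocks = [{j for j, bit in enumerate(row) if bit} for row in rows]

pair_counts = {
    pair: sum(1 for block in blocks if set(pair) <= block)
    for pair in combinations(range(7), 2)
}

for row in rows:
    print(row)
print(set(pair_counts.values()))
```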


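6.11 If $|C_i| = \lambda$ for some i, then every other $C_j$
contains $C_i$, so $m \le n + 1 - \lambda \le n$.

Suppose not. Let $\gamma_i = |C_i| - \lambda > 0$ for 1 ≤ i ≤ m.

For the incidence matrix A, we have

$$\det(AA^t) = \begin{vmatrix}
\lambda + \gamma_1 & \lambda & \cdots & \lambda & \lambda \\
\lambda & \lambda + \gamma_2 & \cdots & \lambda & \lambda \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
\lambda & \lambda & \cdots & \lambda + \gamma_{m-1} & \lambda \\
\lambda & \lambda & \cdots & \lambda & \lambda + \gamma_m
\end{vmatrix}
= \gamma_1 \cdots \gamma_m \left[ 1 + \lambda (1/\gamma_1 + \cdots + 1/\gamma_m) \right] \ne 0.$$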
6.20 The result follows from the observation
that, for any given α and β modulo n, there exist
unique (mod n) i and j such that i + j ≡ α and i − j
≡ β. For 2i ≡ α + β and 2j ≡ α − β, and this determines
unique i and j since n is odd.
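The observation can be illustrated by building the two squares explicitly. This sketch assumes the exercise concerns the Latin squares with entries i + j and i − j modulo n (inferred from the solution); the function names are hypothetical. Two squares are orthogonal when the $n^2$ ordered pairs of superimposed entries are all distinct.

```python
def orthogonal(A, B):
    """True if superimposing A and B yields n^2 distinct ordered pairs."""
    n = len(A)
    return len({(A[i][j], B[i][j]) for i in range(n) for j in range(n)}) == n * n

def sum_and_difference_squares(n):
    """Squares with entries (i + j) mod n and (i - j) mod n."""
    A = [[(i + j) % n for j in range(n)] for i in range(n)]
    B = [[(i - j) % n for j in range(n)] for i in range(n)]
    return A, B

for n in range(2, 10):
    print(n, orthogonal(*sum_and_difference_squares(n)))
```

For even n the difference square repeats entries within a row, so the pair fails to be orthogonal, which matches the role of the hypothesis that n is odd.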


6.21

1 3 2 4      1 2 3 4      1 2 3 4
2 4 1 3      2 1 4 3      4 3 2 1
4 2 3 1      3 4 1 2      2 1 4 3
3 1 4 2      4 3 2 1      3 4 1 2


6.33 Each lattice point has all n integer
coordinates. Its lattice neighbors are those points
with the same coordinates save for one change to 
an integer one greater or one lesser. Hence, each 
lattice point has 2n neighbors. 